Limited By:
Subject = Data Quality;
22 items found
Displaying Results 1 - 22 of 22 on page 1 of 1
Challenges for Value-Driven Semantic Data Quality Management
(2017)
BRENNAN, ROB
http://hdl.handle.net/2262/79953
A data quality framework for process mining of electronic health record data
(2018)
Fox, Frank; Aggarwal, Vishal R.; Whelton, Helen; Johnson, Owen
Abstract:
Reliable research demands data of known quality. This can be very challenging for electronic health record (EHR) based research where data quality issues can be complex and often unknown. Emerging technologies such as process mining can reveal insights into how to improve care pathways but only if technological advances are matched by strategies and methods to improve data quality. The aim of this work was to develop a care pathway data quality framework (CP-DQF) to identify, manage and mitigate EHR data quality in the context of process mining, using dental EHRs as an example. Objectives: To: 1) Design a framework implementable within our e-health record research environments; 2) Scale it to further dimensions and sources; 3) Run code to mark the data; 4) Mitigate issues and provide an audit trail. Methods: We reviewed the existing literature covering data quality frameworks for process mining and for data mining of EHRs and constructed a unified data quality framework that met the...
http://hdl.handle.net/10468/6700
A rule based approach to data certification - applying DQXML for system independent data certification
(2013)
Hossain, Fakir
Abstract:
Many researchers and practitioners have been attracted to improve data quality due to its monumental importance as a key success factor. Mathematical and statistical models have been deployed to information systems to introduce constraint and transaction based mechanisms to prevent data quality related problems. Entire management of the process and roles involved in data generation has also been scrutinized. Vast amount of knowledge base progressed in this area are mostly limited from practical perspective. Quality related meta data is absent from most information systems. Neither process mapping nor data modelling provides sufficient provision to measure quality or certification of data in the information systems. Furthermore, on-going monitoring of data for quality conformance through a separate process is expensive and time consuming. Recognising this limitation and aiming to provide a practically oriented comprehensive approach, I propose a process centric quality focused solution in...
http://doras.dcu.ie/19404/
An Exploration of the Relationship between the Partisan-Business Cycle and Economic Inequality within Developed Economies
(2016)
O'Doherty, Richard
Abstract:
Recent contributions to the study of inequality have provided strong evidence towards the presence of an established trend, over several decades, of growing economic inequality (with a particular focus on distribution within their tails; i.e. top 10%, 1%) across countries with developed economies and indications of similar trends across developing economies. While the causality and influencing factors to these trends has widely been discussed, and has ranged from declining domestic growth rates as economies move towards high mass consumption states to globalisation, political decision making and policy application been referred to as both contributory or an instrument for dampening such trends, through the application of Partisan Business Cycles. However contemporary studies have indicated a declining correlation between political orientation of national legislatures or executive branches and domestic economic and financial performance (which by extension, through its strong correlati...
https://arrow.dit.ie/scschcomdis/82
Assessing the quality of geospatial linked data – experiences from Ordnance Survey Ireland (OSi)
(2018)
Debattista, Jeremy; Clinton, Eamon; Brennan, Rob
Abstract:
Ordnance Survey Ireland (OSi) is Ireland’s national mapping agency that is responsible for the digitisation of the island’s infrastructure in terms of mapping. Generating data from various sensors (e.g. spatial sensors), OSi builds its knowledge in the Prime2 framework, a subset of which is transformed into geo-Linked Data. In this paper we discuss how the quality of the generated semantic data fares against datasets in the LOD cloud. We set up Luzzu, a scalable Linked Data quality assessment framework, in the OSi pipeline to continuously assess produced data in order to tackle any quality problems prior to publishing.
http://doras.dcu.ie/22977/
Automated Highway Tag Assessment of OpenStreetMap Road Networks
(2014)
Jilani, Musfira; Corcoran, Padraig; Bertolotto, Michela
Abstract:
22nd ACM SIGSPATIAL (International Conference on Advances in Geographic Information Systems), Dallas, Texas, USA, 4-7 November, 2014
OpenStreetMap (OSM) has been demonstrated to be a valuable source of spatial data in the context of many applications. However concerns still exist regarding the quality of such data and this has limited the proliferation of its use. Consequently much research has been invested in the development of methods for assessing and/or improving the quality of OSM data. However most of these methods require ground-truth data, which, in many cases, may not be available. In this paper we present a novel solution for OSM data quality assessment that does not require ground-truth data. We consider the semantic accuracy of OSM street network data, and in particular, the associated semantic class (road class) information. A machine learning model is proposed that learns the geometrical and topological characteristics of different semantic classes of streets. This...
http://hdl.handle.net/10197/6125
Data Quality Problems and Proactive Data Quality Management in Data-Warehouse-Systems
(2002)
Helfert, Markus; Zellner, Gregor; Sousa, Carlos
Abstract:
The abstract is included in the text.
http://mural.maynoothuniversity.ie/12873/
Data quality problems in TPC-DI based data integration processes
(2018)
Yang, Qishan; Ge, Mouzhi; Helfert, Markus
Abstract:
Many data driven organisations need to integrate data from multiple, distributed and heterogeneous resources for advanced data analysis. A data integration system is an essential component to collect data into a data warehouse or other data analytics systems. There are various alternatives of data integration systems which are created in-house or provided by vendors. Hence, it is necessary for an organisation to compare and benchmark them when choosing a suitable one to meet its requirements. Recently, the TPC-DI is proposed as the first industrial benchmark for evaluating data integration systems. When using this benchmark, we find some typical data quality problems in the TPC-DI data source such as multi-meaning attributes and inconsistent data schemas, which could delay or even fail the data integration process. This paper explains processes of this benchmark and summarises typical data quality problems identified in the TPC-DI data source. Furthermore, in order to prevent data q...
http://doras.dcu.ie/22315/
Development of a Climate Forcing Observation System for Africa: Data-Related Considerations
(2019)
Saunders, Matthew; Beck, Johannes; López-Ballesteros, Ana; Hugo, Wim; Scholes, Robert; Helmschrot, Jörg
Abstract:
In the case of the African continent, the estimates of most climate forcing components are associated with large uncertainties, above all the greenhouse gas budget. The EU-funded SEACRIFOG project is designing an observation network which aims at reducing these uncertainties. In this practice paper, we present the various steps towards the design of this network and discuss the data-related implications. This includes the formulation of appropriate observational requirements for each variable considered essential to quantify Africa-wide climate forcing as well as an assessment of corresponding available observational infrastructures and data in order to determine data gaps, needs and priorities. The results are intended to inform the design of an interoperable African data infrastructure for environmental observations.
http://hdl.handle.net/2262/91200
Discovering Dynamic Integrity Rules with a Rules-Based Tool for Data Quality Analyzing
(2010)
Pham Thi, Thanh Thoa; Helfert, Markus
Abstract:
Rules based approaches for data quality solutions often use business rules or integrity rules for data monitoring purpose. Integrity rules are constraints on data derived from business rules into a formal form in order to allow computerization. One of challenges of these approaches is rules discovering, which is usually manually made by business experts or system analysts based on experiences. In this paper, we present our rule-based approach for data quality analyzing, in which we discuss a comprehensive method for discovering dynamic integrity rules.
http://mural.maynoothuniversity.ie/7017/
Enhancing the Utility of Anonymized Data by Improving the Quality of Generalization Hierarchies
(2018)
Ayala-Rivera, Vanessa; McDonagh, Patrick; Cerqueus, Thomas; Murphy, Liam, B.E.; Thorpe, Christina
Abstract:
The dissemination of textual personal information has become an important driver of innovation. However, due to the possible content of sensitive information, this data must be anonymized. A commonly-used technique to anonymize data is generalization. Nevertheless, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used as poorly-specified VGHs can decrease the usefulness of the resulting data. To tackle this problem, in our previous work we presented the Generalization Semantic Loss (GSL), a metric that captures the quality of categorical VGHs in terms of semantic consistency and taxonomic organization. We validated the accuracy of GSL using an intrinsic evaluation with respect to a gold standard ontology. In this paper, we extend our previous work by conducting an extrinsic evaluation of GSL with respect to the performance that VGHs have in anonymization (using data utility metrics). We show how GSL can be used to perform an a priori assessment of the...
http://hdl.handle.net/10197/9317
GazeVisual: A practical software tool and web application for performance evaluation of eye tracking systems
(2019)
Kar, Anuradha; Corcoran, Peter
Abstract:
The concept and functionalities of a software tool developed for in depth performance evaluation of eye gaze estimation systems is presented. The software, GazeVisual has capabilities for quantitative, statistical, and visual analysis of eye gaze data as well as generation of static and dynamic visual stimuli for sample gaze data collection. This is a first of its kind cross-platform tool for gaze data analysis and evaluation. This software is made freely available to the eye gaze research and development community to provide a common framework for estimating the quality and reliability of data from eye tracking systems, especially those implemented in consumer electronics (CE) applications. The feasibility of using this software is tested through case studies which show that the software can handle eye gaze datasets obtained from several different consumer grade eye trackers. GazeVisual operates consistently, irrespective of the platform, algorithm or hardware of the eye trackers. ...
http://hdl.handle.net/10379/15475
Guidelines of data quality issues for data integration in the context of the TPC-DI benchmark
(2017)
Yang, Qishan; Ge, Mouzhi; Helfert, Markus
Abstract:
Nowadays, many business intelligence or master data management initiatives are based on regular data integration, since data integration intends to extract and combine a variety of data sources, it is thus considered as a prerequisite for data analytics and management. More recently, TPC-DI is proposed as an industry benchmark for data integration. It is designed to benchmark the data integration and serve as a standardisation to evaluate the ETL performance. There are a variety of data quality problems such as multi-meaning attributes and inconsistent data schemas in source data, which will not only cause problems for the data integration process but also affect further data mining or data analytics. This paper has summarised typical data quality problems in the data integration and adapted the traditional data quality dimensions to classify those data quality problems. We found that data completeness, timeliness and consistency are critical for data quality management in data inte...
http://doras.dcu.ie/21814/
Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies
(2017)
Ayala-Rivera, Vanessa; Cerqueus, Thomas; Murphy, Liam, B.E.; Thorpe, Christina
Abstract:
IEEE 17th International Conference on Information Reuse and Integration (IRI), Pittsburgh, PA, USA, July, 2016
The dissemination of textual personal information has become a key driver for innovation and value creation. However, due to the possible content of sensitive information, this data must be anonymized, which can reduce its usefulness for secondary uses. One of the most used techniques to anonymize data is generalization. However, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used to dictate the anonymization of data, as poorly-specified VGHs can reduce the usefulness of the resulting data. To tackle this problem, we propose a metric for evaluating the quality of textual VGHs used in anonymization. Our evaluation approach considers the semantic properties of VGHs and exploits information from the input datasets to predict with higher accuracy (compared to existing approaches) the potential effectiveness of VGHs for anonymizing data. As ...
http://hdl.handle.net/10197/8767
Information quality and diverse information systems situations
(2011)
Foley, Owen
Abstract:
Information quality is a recurring problem that many organisations contend with. Despite investment in both technology and the refinement of information systems, the problem persists. Information systems deployment has, in recent years, undergone radical change: the traditional deployment, where the architecture, user and access device were known at the time of development, has been replaced by more diverse situations. These diverse situations include web interfaces, traditional client server and a mobile devices revolution. The aim of our research is to improve information quality assessment by catering for diverse information systems situations by the design and construction of a method. Several information quality frameworks have been developed to cater for these new and evolving information systems. The expansion of frameworks across a large number of domains presents problems with respect to: framework choice, appropriateness, validity and users' perceptions of information quality...
http://doras.dcu.ie/16432/
Integration of multiple network views in Wikipedia
(2015)
Wu, Guangyu; Cunningham, Pádraig
Abstract:
One of the challenges in network data analysis is the determination of the most informative perspective on the network to use in analysis. This is particularly an issue when the network is dynamic and is defined by events that occur over time. We present an example of such a scenario in the analysis of edit networks in Wikipedia, the networks of editors interacting on Wikipedia pages. We propose the prediction of article quality as a task that allows us to quantify the informativeness of alternative network views. We present three fundamentally different views on the data that attempt to capture structural and temporal aspects of the edit networks. We demonstrate that each view captures information that is unique to that view and propose a strategy for integrating the different sources of information.
Science Foundation Ireland
http://hdl.handle.net/10197/6416
Is the LOD cloud at risk of becoming a museum for datasets? Looking ahead towards a fully collaborative and sustainable LOD cloud.
(2019)
Debattista, Jeremy; Attard, Judie; Brennan, Rob; O'Sullivan, Declan
Abstract:
The Linked Open Data (LOD) cloud has been around since 2007. Throughout the years, this prominent depiction served as the epitome for Linked Data and acted as a starting point for many. In this article we perform a number of experiments on the dataset metadata provided by the LOD cloud, in order to understand better whether the current visualised datasets are accessible and with an open license. Furthermore, we perform quality assessment of 17 metrics over accessible datasets that are part of the LOD cloud. These experiments were compared with previous experiments performed on older versions of the LOD cloud. The results showed that there was no improvement on previously identified problems. Based on our findings, we therefore propose a strategy and architecture for a potential collaborative and sustainable LOD cloud.
http://hdl.handle.net/2262/90719
Issues with data quality for wind turbine condition monitoring and reliability analyses
(2019)
Leahy, Kevin; Gallagher, Colm V.; O'Donovan, Peter; O'Sullivan, Dominic T. J.
Abstract:
In order to remain competitive, wind turbines must be reliable machines with efficient and effective maintenance strategies. However, thus far, wind turbine reliability information has been closely guarded by the original equipment manufacturers (OEMs), and turbine reliability studies often rely on data that are not always in a usable or consistent format. In addition, issues with turbine maintenance logs and alarm system data can make it hard to identify historical periods of faulty operation. This means that building new and effective data-driven condition monitoring techniques and methods can be challenging, especially those that rely on supervisory control and data acquisition (SCADA) system data. Such data are rarely standardised, resulting in challenges for researchers in contextualising these data. This work aims to summarise some of the issues seen in previous studies, highlighting the common problems seen by researchers working in the areas of condition monitoring and relia...
http://hdl.handle.net/10468/7903
Measuring accuracy of triples in knowledge graphs
(2017)
Liu, Shuangyan; d’Aquin, Mathieu; Motta, Enrico
Abstract:
An increasing amount of large-scale knowledge graphs have been constructed in recent years. Those graphs are often created from text-based extraction, which could be very noisy. So far, cleaning knowledge graphs are often carried out by human experts and thus very inefficient. It is necessary to explore automatic methods for identifying and eliminating erroneous information. In order to achieve this, previous approaches primarily rely on internal information, i.e. the knowledge graph itself. In this paper, we introduce an automatic approach, Triples Accuracy Assessment (TAA), for validating RDF triples (source triples) in a knowledge graph by finding consensus of matched triples (among target triples) from other knowledge graphs. TAA uses knowledge graph interlinks to find identical resources and apply different matching methods between the predicates of source triples and target triples. Then based on the matched triples, TAA calculates a confidence score to indicate the correctness...
http://hdl.handle.net/10379/6892
Multi-century trends to wetter winters and drier summers in the England and Wales precipitation series explained by observational and sampling bias in early records
(2019)
Murphy, Conor; Wilby, Robert L.; Matthews, Tom K.R.; Thorne, Peter; Broderick, Ciaran; Fealy, Rowan; Hall, Julia; Harrigan, Shaun; Jones, Phil D.; McCarthy, Gerard; MacDonald, Neil; Noone, Simon; Ryan, Ciara
Abstract:
Globally, few precipitation records extend to the 18th century. The England Wales Precipitation (EWP) series is a notable exception with continuous monthly records from 1766. EWP has found widespread use across diverse fields of research including trend detection, evaluation of climate model simulations, as a proxy for mid-latitude atmospheric circulation, a predictor in long-term European gridded precipitation data sets, the assessment of drought and extremes, tree-ring reconstructions and as a benchmark for other regional series. A key finding from EWP has been the multi-centennial trends towards wetter winters and drier summers. We statistically reconstruct seasonal EWP using independent, quality-assured temperature, pressure and circulation indices. Using a sleet and snow series for the UK derived by Profs. Gordon Manley and Elizabeth Shaw to examine winter reconstructions, we show that precipitation totals for pre-1870 winters are likely biased low due to gauge under-catch of s...
http://mural.maynoothuniversity.ie/10972/
On learnability of constraints from RDF data
(2016)
Muñoz, Emir
Abstract:
RDF is structured, dynamic, and schemaless data, which enables a big deal of flexibility for Linked Data to be available in an open environment such as the Web. However, for RDF data, flexibility turns out to be the source of many data quality and knowledge representation issues. Tasks such as assessing data quality in RDF require a different set of techniques and tools compared to other data models. Furthermore, since the use of existing schema, ontology and constraint languages is not mandatory, there is always room for misunderstanding the structure of the data. Neglecting this problem can represent a threat to the widespread use and adoption of RDF and Linked Data. Users should be able to learn the characteristics of RDF data in order to determine its fitness for a given use case, for example. For that purpose, in this doctoral research, we propose the use of constraints to inform users about characteristics that RDF data naturally exhibits, in cases where ontologies (or any oth...
http://hdl.handle.net/10379/6014
The Social Media Perception and Reality-Possible Data Quality Deficiencies between Social Media and ERP.
(2019)
Popescu, Mirona Ana-Maria; Ge, Mouzhi; Helfert, Markus
Abstract:
With the increase of digitalisation, data in social media are often seen as more updated and realistic than the information system representations. Due to the fast changes in the real world and the increasing Big Social media data, there is usually certain misalignment between the social media and information system in the enterprise such as ERP, therefore there can be data deficiencies or data quality problems in the information systems, which is caused by the differences between the external social media and internal information system. In this paper, underpinned by the work of ontological data quality from Wang and Wand 1996, we investigate a set of data quality problems between two representations - Social Media and ERP. We further discuss how ERP system can be improved from the data quality perspective.
http://mural.maynoothuniversity.ie/13438/
Institution
Dublin City University (5)
Dublin Institute of Technology (1)
Maynooth University (4)
NUI Galway (3)
Trinity College Dublin (3)
University College Cork (2)
University College Dublin (4)
Item Type
Book chapter (1)
Conference item (5)
Journal article (7)
Master thesis (taught) (1)
Working paper (1)
Other (7)
Peer Review Status
Peer-reviewed (9)
Non-peer-reviewed (2)
Unknown (11)
Year
2019 (6)
2018 (4)
2017 (4)
2016 (2)
2015 (1)
2014 (1)
2013 (1)
2011 (1)
2010 (1)
2002 (1)