Data quality
Paper instructions:
1. Data quality can be assessed in terms of accuracy, completeness, and consistency.
Propose two other dimensions of data quality.
2. How is a quantile-quantile plot different from a quantile plot?
3. In real-world data, tuples with missing values for some attributes are a common occurrence.
Describe various methods for handling this problem.
4. Using the data for age given in Exercise 2, answer the following:
a. Use smoothing by bin means to smooth these data, using a bin depth of 3. Illustrate your
steps.
b. Comment on the effect of this technique for the given data.
c. How might you determine outliers in the data?
d. What other methods are there for data smoothing?
5. Robust data loading poses a challenge in database systems because the input data are often dirty.
In many cases, an input record may have several missing values and some records could be
contaminated (i.e., with some data values out of range or of a different data type than expected). Work
out an automated data cleaning and loading algorithm so that the erroneous data will be marked and
contaminated data will not be mistakenly inserted into the database during data loading.
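As a starting point for Question 5, the following is a minimal sketch of one possible validate-then-load approach, not a definitive answer. It assumes each input record arrives as a dictionary of strings (for example, from a CSV reader). The schema, the table names `person` and `rejected`, the value ranges, and the sample records are all hypothetical and chosen only for illustration. Any record with a missing, mistyped, or out-of-range value is marked with its reasons and kept out of the main table.

```python
import math
import sqlite3


def to_float(text):
    """Convert text to a finite float; reject 'nan' and 'inf'."""
    value = float(text)
    if not math.isfinite(value):
        raise ValueError("non-finite number")
    return value


# Hypothetical schema: field -> (converter, (low, high) bounds or None).
SCHEMA = {
    "name": (str, None),
    "age": (int, (0, 120)),
    "income": (to_float, (0.0, None)),
}


def check_record(raw):
    """Return (cleaned, problems) for one raw record (dict of strings)."""
    cleaned, problems = {}, []
    for field, (convert, bounds) in SCHEMA.items():
        text = (raw.get(field) or "").strip()
        if text == "":
            problems.append(f"{field}: missing")
            continue
        try:
            value = convert(text)
        except ValueError:
            problems.append(f"{field}: wrong type ({text!r})")
            continue
        if bounds is not None:
            low, high = bounds
            too_low = low is not None and value < low
            too_high = high is not None and value > high
            if too_low or too_high:
                problems.append(f"{field}: out of range ({value})")
                continue
        cleaned[field] = value
    return cleaned, problems


def load(records, conn):
    """Insert clean records into person; mark all others in rejected."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person (name TEXT, age INTEGER, income REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rejected (line INTEGER, raw TEXT, reasons TEXT)"
    )
    loaded = rejected = 0
    for line_no, raw in enumerate(records, start=1):
        cleaned, problems = check_record(raw)
        if problems:
            # Marked as erroneous: stored with its reasons, never in person.
            conn.execute(
                "INSERT INTO rejected VALUES (?, ?, ?)",
                (line_no, repr(raw), "; ".join(problems)),
            )
            rejected += 1
        else:
            conn.execute(
                "INSERT INTO person VALUES (?, ?, ?)",
                (cleaned["name"], cleaned["age"], cleaned["income"]),
            )
            loaded += 1
    conn.commit()
    return loaded, rejected


# Hypothetical sample input, invented for this sketch.
records = [
    {"name": "Ann", "age": "34", "income": "52000"},
    {"name": "Bob", "age": "", "income": "41000"},    # missing value
    {"name": "Cy", "age": "abc", "income": "38000"},  # wrong data type
    {"name": "Di", "age": "250", "income": "61000"},  # out of range
]
conn = sqlite3.connect(":memory:")
result = load(records, conn)
print(result)
```

In this sketch the `rejected` table serves as the marking mechanism: each erroneous record is stored with its line number and the reasons it failed, so it can be reviewed or repaired later. A gentler policy, such as loading records that are only missing values with NULLs and a flag column, would be an equally reasonable design choice.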