How to check Data Details
The following steps show how to check the details of data that has been uploaded to a Dataset.
The Data Details panel is used to view and edit settings for data uploaded via the Data Files page. The settings in the Data Details panel should be checked before the Dataset is published.
- Log in to the RDA
Default view of RDA user interface
- Click Projects & Datasets
Projects & Datasets menu item
The Projects & Datasets section is selected by default.
- Select a Project from the left-hand navigation menu
Left-hand navigation menu showing available Projects
- Click the Dataset to be checked
Sample unpublished Dataset
- Click Data Files
Data Files menu item
- Click Data Details
Sample data file upload
The Data Import Settings panel is displayed.
- Edit the settings for each field. For more information about each field, see Data Details
Data Import Settings
- Personal Identifiable Data (PID) Template
  - Choose a template that matches the Personal Identifiable Data in the uploaded data
- Distribution Column
  - Determines how data is split among the partitions when it is uploaded to SAIL. The default setting should not normally need to be changed
- Field Name
  - Automatically generated names of fields in the uploaded data. Any spaces are replaced with underscores. The field names can be adjusted, but this should not be necessary
- Friendly Name
  - A meaningful name for the field, which will be useful to a user who is not familiar with the data
- Field Description
  - A short description of the data contained within the field
- Personal Identifiable Data (PID) Type
  - The type of Personal Identifiable Data (if any) in the field. A Personal Identifiable Data Template must be chosen before this field can be edited
- Field Type
  - An automated assessment of the type of data in the field, e.g. CHAR for alphanumeric data of a defined size
- DQ Validation Rules
  - Add validation rules to confirm that the data is imported correctly. The available rules are:
    - NONE – The default if no DQ rule is given
    - Range – For numeric, date, time and datetime data types only. If the data should fall between two values, specify the Min and Max values. The field's data is validated during publish, and validation errors are reported in the Data Quality Report
    - Local Lookup – Validates all the data in the selected field against a temporary lookup table. Add all the valid values for the field in the local lookup table section. Any value that cannot be found in the table is marked as invalid in the Data Quality Report. This option is recommended when the field has only a small number of valid values
    - Reference Table – Validates all the data in the selected field against values in a database lookup table. Specify the lookup table dataset name, the lookup table name and the lookup column name. The lookup column should contain all the valid values for the field. Any value that cannot be found in the lookup table is marked as invalid in the Data Quality Report
- Primary Key
  - Tick this box to show that the field contains a unique identifier, e.g. an email address. More than one field may be marked as a Primary Key
- Show in Data Quality Report
  - Tick this box to include the field in the Data Quality Report. Fields that are not relevant should be excluded
- Bookmark
  - Tick this box to mark the field as a bookmark
- Click this button to save the changes and validate the data. The Pre-Publish process cannot proceed until all data has been validated
- Click this button to save the changes and close the Data Details panel without attempting to validate the data
- Click to validate the data, or click to save without validating the data
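
To make the Range and Local Lookup rules described above more concrete, the following Python sketch shows one way such checks could behave. It is only an illustration and not the RDA's actual implementation: the function names (range_rule, lookup_rule, invalid_rows) and the sample data are hypothetical, and treating the Min and Max values as inclusive is an assumption.

```python
# Illustrative sketch only: these helpers are hypothetical and are not part of the RDA.

def range_rule(minimum, maximum):
    """Build a check that passes when minimum <= value <= maximum.

    Treating both bounds as inclusive is an assumption.
    """
    def check(value):
        return minimum <= value <= maximum
    return check


def lookup_rule(valid_values):
    """Build a check that passes when the value appears in the lookup values."""
    allowed = set(valid_values)

    def check(value):
        return value in allowed
    return check


def invalid_rows(values, check):
    """Return (row_index, value) pairs for every value that fails the check."""
    return [(index, value) for index, value in enumerate(values) if not check(value)]


# Made-up sample columns, standing in for fields in an uploaded data file.
ages = [34, 51, -2, 130, 67]
sexes = ["M", "F", "X", "F"]

# Range rule with Min = 0 and Max = 120.
age_errors = invalid_rows(ages, range_rule(0, 120))

# Local Lookup rule with a small set of valid values.
sex_errors = invalid_rows(sexes, lookup_rule(["M", "F"]))

print(age_errors)  # [(2, -2), (3, 130)]
print(sex_errors)  # [(2, 'X')]
```

A Reference Table rule could be pictured in the same way as lookup_rule, with the valid values read from the chosen lookup column rather than entered by hand. In the RDA itself, values that fail a rule are reported in the Data Quality Report.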