Paradata file format

January 30, 2024

Every survey collected in Survey Solutions is supplemented with a paradata file. No action is needed from the questionnaire designers or the headquarters team to collect it; the Survey Solutions software produces it automatically.

The paradata files describe the process of data collection. They explain how the data was entered, detailing all edits, who made them, and when. These files may be large and are most conveniently processed using specialized statistical packages rather than general-purpose tools.

The paradata is supplied in a zip archive with a tab-delimited data file and supplementary meta-data files inside.

Contents of paradata export archive

File: paradata.tab
Description: Paradata in tab-delimited format. Each line of this file corresponds to one recorded event. See the description of the columns of this file below.

File: paradata.do
Description: Script for the Stata statistical package to import the tab-delimited paradata.

File: export__readme.txt
Description: Human-readable description file (in text format).

File: export__info.json
Description: Machine-readable description/identification file (in JSON format).
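For readers who process the export outside Stata, the following is a minimal Python sketch of reading paradata.tab straight from the archive. It assumes the file is UTF-8 encoded, starts with a header row naming the columns, and uses no quoting; none of these details are stated above, so verify them against an actual export. The function name read_paradata is hypothetical.

```python
import csv
import io
import zipfile


def read_paradata(archive):
    """Yield one dict per recorded event from a paradata export archive.

    `archive` is a path or a binary file-like object holding the zip file.
    Assumptions (not confirmed by the format description): UTF-8 text,
    a header row with the column names, and no quoting of field values.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open("paradata.tab") as raw:
            # "utf-8-sig" also tolerates a byte-order mark, if one is present.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            yield from reader
```

Because the function is a generator, a large file is streamed row by row rather than loaded into memory at once.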

Columns of paradata.tab file

Paradata events are recorded, along with their attributes, in the following columns:

Variable: interview__id
Type: string
Meaning: 32-character hexadecimal ID (GUID) of the interview affected by the event.
Example: 75efdc0456fb4b35be4690bd19eab870

Variable: order
Type: numeric integer
Meaning: Numeric sequential ID of the event (starts from 1 for every interview).
Example: 1

Variable: event
Type: string
Meaning: Type of the event that has been recorded (see coding of event types below).
Example: AnswerSet

Variable: responsible
Type: string
Meaning: Login name of the person responsible for the event.
Example: Enumerator25

Variable: role
Type: numeric
Meaning: Role of the person mentioned in the responsible column (see coding of roles below).
Example: 1

Variable: timestamp_utc
Type: string
Meaning: Date and time when the event occurred, combined in a single timestamp (in UTC), using the following format: YYYY-MM-DDThh:mm:ss.msc
Example: 2018-12-28T23:00:59.123

Variable: tz_offset
Type: string
Meaning: Time zone offset (relative to UTC).
Example: -05:00:00

Variable: parameters
Type: string
Meaning: One or more parameters of the event, the interpretation of which depends on the type of event.
Example: GPSLOC||16.73526463,75.93207878[13]27||2.0
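The timestamp_utc and tz_offset columns can be combined to recover the local time of an event. The Python sketch below assumes that tz_offset is the amount to add to UTC to obtain local time (the usual convention) and that a positive offset may appear with or without a leading plus sign; neither detail is confirmed above. The helper names parse_offset and local_time are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(tz_offset):
    """Convert an offset such as "-05:00:00" into a timedelta."""
    sign = -1 if tz_offset.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in tz_offset.lstrip("+-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


def local_time(timestamp_utc, tz_offset):
    """Return the event time as a timezone-aware datetime in local time.

    Assumption: local time = UTC + tz_offset.
    """
    utc = datetime.fromisoformat(timestamp_utc).replace(tzinfo=timezone.utc)
    return utc.astimezone(timezone(parse_offset(tz_offset)))


# Example values taken from the column descriptions above.
print(local_time("2018-12-28T23:00:59.123", "-05:00:00").isoformat())
# prints 2018-12-28T18:00:59.123000-05:00
```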

Paradata events and associated codes

The following table outlines the types of events tracked by Survey Solutions. You may encounter some of them (though not all) in the paradata. The table also explains how to interpret the parameters column for each type of event. Whether some event types are present depends on the version of Survey Solutions.

Events are listed alphabetically by name.

Event name: AnswerRemoved
Numeric code: 3
Meaning: Question's answer was removed (cleared).
Parameters: varname||OptionalRosterAddress

Event name: AnswerSet
Numeric code: 2
Meaning: Question was answered in the interview.
Parameters: varname||value||OptionalRosterAddress
Values are mostly the same as in the tab-delimited export files, with a few exceptions where the value in the tab-delimited file is split among multiple columns.
Values of multiselect questions are recorded as codes of selected items separated by commas: 323.0, 315.0, 147.0
Values of text list questions are recorded as the specified items separated by the | character: Sergiy|Maryna|Natalia
Values of GPS questions are represented in the form latitude,longitude[accuracy]altitude, such as 16.73526463,75.93207878[13]27

Event name: ApproveByHeadquarter
Numeric code: 8
Meaning: Indicates when the interview was approved by an HQ user.
Parameters: Comment entered by the HQ user during approval.

Event name: ApproveBySupervisor
Numeric code: 7
Meaning: Indicates when the supervisor approved the interview.
Parameters: Comment entered by the supervisor during approval.

Event name: ClosedBySupervisor
Numeric code: 30
Meaning: Indicates when the supervisor closed the interview opened for a review.
Parameters: NO PARAMETERS

Event name: CommentSet
Numeric code: 4
Meaning: Occurs when a comment was written to a question in the interview.
Parameters: varname||comment if the question is not in any roster; varname||comment||OptionalRosterAddress if the question is part of a roster.

Event name: Completed
Numeric code: 5
Meaning: Indicates when the interview was marked as completed by the interviewer.
Parameters: Comment entered by the interviewer at completion.

Event name: Deleted
Numeric code: 11
Meaning: Reserved.
Parameters: NO PARAMETERS

Event name: GroupDisabled
Numeric code: 16
Note: ⚠ This event is not included in the exported paradata file.
Meaning: Event that corresponds to the group (section, subsection) being declared as disabled (to be skipped, not answered).
Parameters: NO PARAMETERS

Event name: GroupEnabled
Numeric code: 15
Note: ⚠ This event is not included in the exported paradata file.
Meaning: Event that corresponds to the group (section, subsection) being declared as enabled (to be answered, not skipped).
Parameters: NO PARAMETERS

Event name: InterviewCreated
Numeric code: 32
Meaning: Occurs when the interview is created.
Parameters: NO PARAMETERS

Event name: InterviewerAssigned
Numeric code: 1
Meaning: Event that occurs when the interviewer becomes responsible for the interview (for example, when the interview is created from an assignment).
Parameters: Name of the interviewer that became responsible for this interview.

Event name: InterviewModeChanged
Numeric code: 31
Meaning: Event that occurs when the interview mode is set or changed (for example, when the interviewer switches from CAPI to CAWI).
Parameters: New mode: CAPI or CAWI.

Event name: KeyAssigned
Numeric code: 27
Meaning: A newly created interview is assigned an interview key. Also occurs when the key of an interview is modified due to a collision with an existing interview's key. The latest event reflects the current interview key. This event may occur only once or twice per interview.
Parameters: Interview key in the form NN-NN-NN-NN

Event name: OpenedBySupervisor
Numeric code: 29
Meaning: Indicates when the supervisor opened the interview for a review.
Parameters: NO PARAMETERS

Event name: Paused
Numeric code: 25
Meaning: Indicates a prolonged pause during the interviewing process, such as when the tablet goes into sleep mode to conserve power.
Parameters: NO PARAMETERS

Event name: QuestionDeclaredInvalid
Numeric code: 18
Meaning: Event corresponding to the value of the question being deemed invalid (not passing the specified validation).
Parameters: varname||OptionalRosterAddress

Event name: QuestionDeclaredValid
Numeric code: 17
Meaning: Event corresponding to the value of the question being deemed valid (passing the specified validation).
Parameters: varname||OptionalRosterAddress

Event name: QuestionDisabled
Numeric code: 14
Note: ⚠ This event is not included in the exported paradata file.
Meaning: Event corresponding to the question being set to the disabled state (question is to be skipped, not answered).
Parameters: varname||OptionalRosterAddress

Event name: QuestionEnabled
Numeric code: 13
Note: ⚠ This event is not included in the exported paradata file.
Meaning: Event corresponding to the question being set to the enabled state (question is to be answered, not skipped).
Parameters: varname||OptionalRosterAddress

Event name: ReceivedByInterviewer
Numeric code: 20
Meaning: Indicates reception of the rejected interview by the interviewer on the tablet.
Parameters: NO PARAMETERS

Event name: ReceivedBySupervisor
Numeric code: 21
Meaning: Indicates when the completed interview was received by the supervisor.
Parameters: NO PARAMETERS

Event name: RejectedByHeadquarter
Numeric code: 10
Meaning: Occurs when an interview is rejected by a headquarters user.
Parameters: Comment written by the HQ user when the interview was rejected.

Event name: RejectedBySupervisor
Numeric code: 9
Meaning: Occurs when an interview is rejected by the supervisor.
Parameters: Comment written by the supervisor when the interview was rejected.

Event name: Restarted
Numeric code: 6
Meaning: Occurs when an interview is restarted on a tablet (from a completed status).
Parameters: NO PARAMETERS

Event name: Restored
Numeric code: 12
Meaning: Reserved. Not to be confused with 'Restarted' above.
Parameters: NO PARAMETERS

Event name: Resumed
Numeric code: 26
Meaning: Indicates resuming work on the interview, such as when the tablet wakes up after going into sleep mode.
Parameters: NO PARAMETERS

Event name: SupervisorAssigned
Numeric code: 0
Meaning: A newly created interview is assigned to the team of the interviewer who started the interview.
Parameters: NO PARAMETERS

Event name: TranslationSwitched
Numeric code: 28
Meaning: Occurs when the language (translation) of the interview is switched.
Parameters: Name of the selected language.

Event name: UnapproveByHeadquarters
Numeric code: 19
Meaning: Occurs when the interview was unapproved by the HQ (or an admin) user.
Parameters: Comment (if provided) when the interview was unapproved. Typically the automatic message "[Approved by Headquarters was revoked]".

Event name: VariableDisabled
Numeric code: 24
Meaning: Occurs when a variable is disabled (when it is part of a section which gets disabled).
Parameters: varname||value||OptionalRosterAddress

Event name: VariableEnabled
Numeric code: 23
Meaning: Occurs when a variable is enabled (when it is part of a section which gets enabled).
Parameters: varname||value||OptionalRosterAddress

Event name: VariableSet
Numeric code: 22
Meaning: Occurs when a variable is recalculated.
Parameters: varname||value||OptionalRosterAddress
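As an illustration of the parameter formats listed for AnswerSet, the Python sketch below splits a varname||value||OptionalRosterAddress string and decodes the three special value formats. All function names are hypothetical. The splitting logic assumes that the variable name never contains ||, and that the roster part may be either empty or omitted entirely for questions outside rosters; which of these actually occurs is not stated above. A text answer that itself contains || would be ambiguous under this approach.

```python
import re

# Assumed pattern for "latitude,longitude[accuracy]altitude".
GPS_PATTERN = re.compile(r"^([^,]+),([^\[]+)\[([^\]]*)\](.*)$")


def split_answer_parameters(parameters):
    """Split "varname||value||OptionalRosterAddress" into its three parts."""
    varname, _, rest = parameters.partition("||")
    if "||" in rest:
        # Split at the last "||" so a value containing "||" stays intact.
        value, _, roster = rest.rpartition("||")
    else:
        value, roster = rest, ""
    return varname, value, roster


def parse_multiselect(value):
    """Decode "323.0, 315.0, 147.0" into [323, 315, 147]."""
    return [int(float(code)) for code in value.split(",") if code.strip()]


def parse_text_list(value):
    """Decode "Sergiy|Maryna|Natalia" into a list of items."""
    return value.split("|")


def parse_gps(value):
    """Decode a GPS value into a dict of floats."""
    match = GPS_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a GPS value: {value!r}")
    latitude, longitude, accuracy, altitude = (float(g) for g in match.groups())
    return {
        "latitude": latitude,
        "longitude": longitude,
        "accuracy": accuracy,
        "altitude": altitude,
    }


varname, value, roster = split_answer_parameters(
    "GPSLOC||16.73526463,75.93207878[13]27||2.0"
)
print(varname, parse_gps(value)["accuracy"], roster)
# prints GPSLOC 13.0 2.0
```

The paradata does not record the question type, so the caller has to know from the questionnaire which decoder applies to which variable.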

OptionalRosterAddress denotes one or more numeric rowcodes, one for each level of nesting, when the event affects an item (question, variable, etc.) in a roster. Multiple rowcodes are separated by commas. For example, 2.0,5.0,0.0 may correspond to the job coded 0 of the person with rowcode 5 in the household with rowcode 2. If the item is not part of any roster, its OptionalRosterAddress is blank.
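A roster address can be turned into a tuple of integer rowcodes, ordered from the outermost roster to the innermost. This is a minimal Python sketch; the function name parse_roster_address is hypothetical, and it assumes rowcodes are always whole numbers written with a decimal part, as in the example above.

```python
def parse_roster_address(address):
    """Convert "2.0,5.0,0.0" into (2, 5, 0); a blank address gives ()."""
    address = address.strip()
    if not address:
        # The item is not part of any roster.
        return ()
    return tuple(int(float(code)) for code in address.split(","))


print(parse_roster_address("2.0,5.0,0.0"))
# prints (2, 5, 0)
```

The length of the resulting tuple gives the nesting depth of the item.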

Varname is the name of the data variable corresponding to a question or a calculated variable (as specified in the Questionnaire Designer).

Coding of roles

Numeric codes used to encode the role in the paradata records have the following meaning:

Numeric code  Meaning
0             <UNKNOWN ROLE>
1             Interviewer
2             Supervisor
3             Headquarter
4             Administrator
5             API User
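When processing the role column, the codes above can be kept in a small lookup table. The following Python sketch is one way to do it; the names ROLE_LABELS and role_label are hypothetical, and falling back to "<UNKNOWN ROLE>" for codes that are not listed is a choice made here, not documented behavior.

```python
# Labels copied from the coding of roles above.
ROLE_LABELS = {
    0: "<UNKNOWN ROLE>",
    1: "Interviewer",
    2: "Supervisor",
    3: "Headquarter",
    4: "Administrator",
    5: "API User",
}


def role_label(code):
    """Return the label for a role code given as an int or a string."""
    # Unlisted codes fall back to the unknown-role label (an assumption).
    return ROLE_LABELS.get(int(float(code)), "<UNKNOWN ROLE>")


print(role_label("1"))
# prints Interviewer
```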