Paradata file format

December 28, 2018

Every survey collected in Survey Solutions is supplemented with a paradata file. No action is needed from the questionnaire designers or the headquarters team to collect it; the Survey Solutions software produces it automatically.

The paradata files describe the process of data collection. They explain how the data was entered, detailing all edits, who made them, and when. These files may be large and are most conveniently processed with specialized statistical packages rather than general-purpose tools.

The paradata is supplied in a zip archive with a tab-delimited data file inside. Each line of this file corresponds to one recorded event. The file contains the following columns:

paradata.tab

Variable: interview__id
Type: string
Meaning: 32-character hexadecimal ID of the interview affected by the event.
Example: 75efdc0456fb4b35be4690bd19eab870

Variable: order
Type: numeric integer
Meaning: Numeric sequential ID of the event (starts from 1 for every interview).
Example: 1

Variable: event
Type: string
Meaning: Type of the event that has been recorded. See below for possible values.
Example: AnswerSet

Variable: responsible
Type: string
Meaning: Login name of the person responsible for the event.
Example: Enumerator25

Variable: role
Type: string
Meaning: Role of the person named in the ‘responsible’ column, one of the following: Interviewer, Supervisor, Headquarter.
Example: Interviewer

Variable: timestamp
Type: string
Meaning: Date and time when the event occurred, combined in a single timestamp.
Example: 2018-12-28T23:00:59

Variable: offset
Type: string
Meaning: Time offset relative to UTC.
Example: -05:00:00

Variable: parameters
Type: string
Meaning: One or more parameters of the event, the interpretation of which depends on the type of event.
Example: GPSLOC||16.73526463,75.93207878[13]27||2.0
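
As an illustration of the column layout above, the following Python sketch reads the events from the archive using only the standard library. It rests on several assumptions that the description above does not confirm: that the file inside the archive is named paradata.tab, that its first line holds the column names, that it is UTF-8 encoded, and that fields are not quoted. The function names are illustrative only. Because it is not stated here whether the timestamp is local time or UTC, the sketch parses the timestamp and the offset separately and does not combine them.

```python
import csv
import io
import zipfile
from datetime import datetime, timedelta


def parse_offset(text):
    """Convert an offset such as '-05:00:00' into a timedelta."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in text.lstrip("+-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


def read_paradata(zip_path, member="paradata.tab"):
    """Yield one dict per recorded event from the archive at zip_path.

    Assumes a header row, UTF-8 text, and unquoted tab-separated fields.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                row["order"] = int(row["order"])
                row["timestamp"] = datetime.fromisoformat(row["timestamp"])
                row["offset"] = parse_offset(row["offset"])
                yield row
```

For example, `for event in read_paradata("paradata.zip"): ...` (with a hypothetical archive name) processes the events one at a time, which avoids loading a large file into memory at once.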

The following table outlines the types of the events recorded in the paradata and the interpretation of the parameters column corresponding to events of this type.

Events are listed alphabetically by name.

Event: AnswerRemoved
Meaning: Question’s answer was removed (cleared).
Parameters: QuestionGUID||OptionalRosterAddress

Event: AnswerSet
Meaning: Question was answered in the interview.
Parameters: varname||value||OptionalRosterAddress
Values are mostly the same as in the tab-delimited export files, with a few exceptions where the value in the tab-delimited file is split among multiple columns:
Values of multiselect questions are recorded as codes of selected items separated by commas: 323.0, 315.0, 147.0
Values of text list questions are recorded as specified items separated by the |-character: Sergiy|Maryna|Natalia
Values of GPS questions are represented in the form latitude,longitude[accuracy]altitude, such as 16.73526463,75.93207878[13]27

Event: ApproveByHeadquarter
Meaning: Indicates when the interview was approved by an HQ user.
Parameters: Comment entered by the HQ user during approval.

Event: ApproveBySupervisor
Meaning: Indicates when the supervisor approved the interview.
Parameters: Comment entered by the supervisor during approval.

Event: ClosedBySupervisor
Meaning: Indicates when the supervisor closed an interview that had been opened for review.
Parameters: none

Event: CommentSet
Meaning: Occurs when a comment is added to a question in the interview.
Parameters: varname||comment if the question is not in any roster; varname||comment||OptionalRosterAddress if the question is part of a roster.

Event: Completed
Meaning: Indicates when the interview was marked as completed by the interviewer.
Parameters: Comment entered by the interviewer at completion.

Event: InterviewerAssigned
Meaning: Occurs when the interviewer becomes responsible for the interview (for example, when the interview is created from an assignment).
Parameters: Name of the interviewer who became responsible for this interview.

Event: KeyAssigned
Meaning: A newly created interview is assigned an interview key.
Parameters: Interview key in the form NN-NN-NN-NN

Event: OpenedBySupervisor
Meaning: Indicates when the supervisor opened the interview for review.
Parameters: none

Event: Paused
Meaning: Indicates a prolonged pause during the interviewing process, such as when the tablet goes into sleep mode to conserve power.
Parameters: none

Event: QuestionDeclaredInvalid
Meaning: The value of the question was deemed invalid (it did not pass the specified validation).
Parameters: varname||OptionalRosterAddress

Event: QuestionDeclaredValid
Meaning: The value of the question was deemed valid (it passed the specified validation).
Parameters: varname||OptionalRosterAddress

Event: ReceivedByInterviewer
Meaning: Indicates that the interviewer received the rejected interview on the tablet.
Parameters: none

Event: ReceivedBySupervisor
Meaning: Indicates when the completed interview was received by the supervisor.
Parameters: none

Event: RejectedBySupervisor
Meaning: Occurs when an interview is rejected by the supervisor.
Parameters: Comment written by the supervisor when the interview was rejected.

Event: Resumed
Meaning: Indicates resuming work on the interview, such as when the tablet wakes up from sleep mode.
Parameters: none

Event: SupervisorAssigned
Meaning: A newly created interview is assigned to the team of the interviewer who started the interview.
Parameters: none

Event: UnapproveByHeadquarters
Meaning: Occurs when the interview was unapproved by an HQ (or admin) user.
Parameters: Comment (if provided) entered when the interview was unapproved. Typically the automatic message “[Approved by Headquarters was revoked]”.

Event: VariableDisabled
Meaning: Occurs when a variable is disabled (when it is part of a section which gets disabled).
Parameters: varname||value||OptionalRosterAddress

Event: VariableSet
Meaning: Occurs when a variable is recalculated.
Parameters: varname||value||OptionalRosterAddress

OptionalRosterAddress denotes one or more numeric rowcodes, one for each level of nesting, when the event affects an item (question, variable, etc.) in a roster. Multiple rowcodes are separated by commas. For example, 2.0,5.0,0.0 may correspond to the job coded 0 of the person with rowcode 5 in the household with rowcode 2. If the item is not part of any roster, its OptionalRosterAddress is blank.
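
A roster address of this form can be turned into a list of rowcodes with a few lines of Python. This is a minimal sketch, assuming that rowcodes are always whole numbers written with a decimal part (as in 2.0); the function name is illustrative.

```python
def parse_roster_address(text):
    """Return the rowcodes of an OptionalRosterAddress, outermost level first.

    A blank address (item not in any roster) gives an empty list.
    Assumes every rowcode is a whole number such as '2.0'.
    """
    text = text.strip()
    if not text:
        return []
    return [int(float(code)) for code in text.split(",")]
```

The length of the returned list gives the nesting depth of the item, so an empty list identifies items outside any roster.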

Varname is the name of the data variable corresponding to a question or a calculated variable (as specified in the Questionnaire Designer).
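
The AnswerSet parameter layout and the value formats described for it above can be decoded along the following lines. This Python sketch is hedged in several ways: it assumes that values never contain the || separator themselves, it tolerates a missing roster address field for questions outside rosters (the exact form in that case is not stated here), and all function names are illustrative. The paradata does not record the question type, so the caller has to know which decoder applies to which variable.

```python
import re

# latitude,longitude[accuracy]altitude, for example 16.73526463,75.93207878[13]27
NUMBER = r"-?\d+(?:\.\d+)?"
GPS_PATTERN = re.compile(
    rf"^(?P<latitude>{NUMBER}),(?P<longitude>{NUMBER})"
    rf"\[(?P<accuracy>{NUMBER})\](?P<altitude>{NUMBER})$"
)


def split_answer_set(parameters):
    """Split 'varname||value||OptionalRosterAddress' into its three parts.

    Missing trailing parts are returned as empty strings.
    """
    parts = parameters.split("||")
    parts += [""] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def parse_gps(value):
    """Return the GPS components as floats, or None if the value does not match."""
    match = GPS_PATTERN.match(value.strip())
    if match is None:
        return None
    return {name: float(number) for name, number in match.groupdict().items()}


def parse_multiselect(value):
    """Return the selected codes of a multiselect answer, in recorded order."""
    return [float(code) for code in value.split(",") if code.strip()]


def parse_text_list(value):
    """Return the items of a text list answer."""
    return value.split("|")
```

Applied to the example from the columns description, `split_answer_set("GPSLOC||16.73526463,75.93207878[13]27||2.0")` separates the variable name, the GPS value, and the roster address, and `parse_gps` can then be called on the middle part.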