API: R-package

February 25, 2020

Introduction to the Survey Solutions API R-package

Why this package

Survey Solutions, the World Bank’s Computer Assisted Survey System (CASS), is a powerful tool for high-quality data collection and is used in many statistical offices around the world. Besides the standard GUI, it also comes with a powerful REST API.

To further facilitate its integration into a fully automated workflow, we have developed this R package, which covers almost all of the available API calls.

This article gives you an overview of the functions implemented in this package.

The package can be considered a “bare-bones” API wrapper, as it implements the Survey Solutions API calls without substantial modification. The exceptions are the data export function and the paradata export, where modification was necessary to facilitate workflow integration. See the corresponding documentation for details.

Please be aware that this package makes heavy use of data.table, and this vignette does the same. If you are not yet familiar with it, refer to this introduction.
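For readers new to data.table, the following minimal sketch shows the DT[i, j] subsetting style used throughout this vignette. The table and its values are invented for illustration; only the column names mirror the user lists shown later.

```r
library(data.table)

# Toy table standing in for an API result (all values are invented)
users <- data.table(
  UserId   = c("id-001", "id-002", "id-003"),
  UserName = c("Sup_A", "Sup_B", "Sup_C"),
  IsLocked = c(FALSE, TRUE, FALSE)
)

# DT[i, j]: pick row 3 and return the UserId column as a plain vector
third_id <- users[3, UserId]

# DT[i, .(cols)]: filter rows and keep selected columns as a data.table
unlocked <- users[IsLocked == FALSE, .(UserId, UserName)]

# DT[, j, by]: aggregate, here counting users per lock state
lock_counts <- users[, .N, by = IsLocked]

print(third_id)
print(unlocked)
print(lock_counts)
```

The first form, with a row index and an unquoted column name, is the one used below to pull a single ID out of a result table.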

API credentials

To use the API, you first need to set up the API user on your Survey Solutions server. See here for details. After this is done, you can use the suso_set_key() function to provide your credentials.

library(SurveySolutionsAPI)
suso_clear_keys()
suso_set_key("https://xxx.mysurvey.solutions", "xxxxxx", "xxxxxxx")
suso_keys()
#> $suso
#> $suso$susoServer
#> [1] "https://xxx.mysurvey.solutions"
#> 
#> $suso$susoUser
#> [1] "xxxxxx"
#> 
#> $suso$susoPass
#> [1] "xxxxxxx"
#> 
#> 
#> attr(,"class")
#> [1] "suso_api"
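Hard-coding credentials in a script makes them easy to leak. As a minimal sketch, they can be read from environment variables instead. The variable names SUSO_SERVER, SUSO_USER and SUSO_PASS and the helper read_suso_credentials are hypothetical and not part of the package.

```r
# Hypothetical helper: read API credentials from environment variables.
# The variable names are an assumption, not a package convention.
read_suso_credentials <- function() {
  creds <- list(
    server = Sys.getenv("SUSO_SERVER", unset = NA),
    user   = Sys.getenv("SUSO_USER", unset = NA),
    pass   = Sys.getenv("SUSO_PASS", unset = NA)
  )
  missing <- names(creds)[vapply(creds, is.na, logical(1))]
  if (length(missing) > 0) {
    stop("Missing credential(s): ", paste(missing, collapse = ", "))
  }
  creds
}

# Demo values only; in practice set these in .Renviron or in the shell
Sys.setenv(SUSO_SERVER = "https://xxx.mysurvey.solutions",
           SUSO_USER = "xxxxxx", SUSO_PASS = "xxxxxxx")
creds <- read_suso_credentials()

# The values can then be passed on in the order used above:
# suso_set_key(creds$server, creds$user, creds$pass)
```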

After this is done, there is no need to provide the credentials for every API call again. They are stored until the end of your current session. There is also a function to check if your credentials work.

suso_PwCheck()
#> Response [https://mcw2.mysurvey.solutions/api/v1/supervisors?limit=200]
#>   Date: 2020-02-25 02:00
#>   Status: 200
#>   Content-Type: application/json; charset=utf-8
#>   Size: 471 B

It calls the supervisor API and returns the response. If the status is 200, your connection and credentials are OK.
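In a script, this check can be made programmatic. The sketch below assumes that suso_PwCheck() returns an httr-style response object, as the printed output suggests, whose status_code element holds the HTTP status. credentials_ok is a hypothetical helper, and stand-in objects replace the real API call so that the example runs offline.

```r
# Hypothetical helper: TRUE if the response carries HTTP status 200.
# Assumes an httr-style response, i.e. a list with a status_code element.
credentials_ok <- function(resp) {
  isTRUE(resp$status_code == 200L)
}

# Stand-ins for real responses (no server needed for this sketch)
ok_resp  <- structure(list(status_code = 200L), class = "response")
bad_resp <- structure(list(status_code = 401L), class = "response")

print(credentials_ok(ok_resp))
print(credentials_ok(bad_resp))

# With a live server one might write, for example:
# if (!credentials_ok(suso_PwCheck())) stop("Check server URL and credentials")
```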

User Management

These functions are particularly useful for survey management, and more details can be found in the corresponding vignette on survey management. Let’s start by getting a list of all supervisors on the server.

sv <- suso_getSV()
print(sv)
#>    IsLocked        CreationDate                               UserId      UserName
#> 1:    FALSE 2019-12-12 06:17:59 4a5be68d-e689-4cd6-83b5-32936a02a8f0  Alex_Sup_Rom
#> 2:    FALSE 2019-12-12 09:37:51 4f3f6d63-90aa-4861-a59a-fb2907135618 Alex_Sup_Rom2
#> 3:    FALSE 2019-07-26 03:15:42 8f01a8a6-4f7c-49f1-a276-2af4ea7636bb     Team_2101

You will receive a list of the supervisors currently active (not archived) on the server. If you take one supervisor’s UserId and pass it to the suso_getINT function, you will receive all the interviewers in that team.

int <- suso_getINT(sv_id = sv[3, UserId])
print(int)
#>    IsLocked        CreationDate         DeviceId                               UserId  UserName
#> 1:    FALSE 2019-07-26 03:16:47 5ca9d1e76d855a78 0d169565-df66-48b4-8c93-3375e21de136 Int1_2101
#> 2:    FALSE 2019-07-26 03:20:42 58ded0cc5de671a2 5ac32a2d-20c5-42b6-a30c-89367d9f65d2 Int2_2101
#> 3:    FALSE 2019-08-09 16:42:47 aab82d9709439494 d1e57346-b13e-47e8-adc1-aaf647711f47 int4_2101

To receive more information about a particular user, use the suso_getINT_info function:

intinfo <- suso_getINT_info(int_id = int[1, UserId])
print(intinfo)
#>    SupervisorName                         SupervisorId IsLockedBySupervisor IsLockedByHeadquarters
#> 1:      Team_2101 8f01a8a6-4f7c-49f1-a276-2af4ea7636bb                FALSE                  FALSE
#>    IsArchived                               UserId  UserName       Roles IsLocked        CreationDate
#> 1:      FALSE 0d169565-df66-48b4-8c93-3375e21de136 Int1_2101 Interviewer    FALSE 2019-07-26 03:16:47

To get information about any particular user, you can also use the more general function suso_getUSR.

usrinfo_sv <- suso_getUSR(uid = sv[3, UserId])
print(usrinfo_sv)
#>    IsArchived                               UserId  UserName      Roles IsLocked        CreationDate
#> 1:      FALSE 8f01a8a6-4f7c-49f1-a276-2af4ea7636bb Team_2101 Supervisor    FALSE 2019-07-26 03:15:42

usrinfo_int <- suso_getUSR(uid = int[1, UserId])
print(usrinfo_int)
#>    IsArchived                               UserId  UserName       Roles IsLocked        CreationDate
#> 1:      FALSE 0d169565-df66-48b4-8c93-3375e21de136 Int1_2101 Interviewer    FALSE 2019-07-26 03:16:47

Questionnaire

The basic questionnaire API calls are handled through the suso_getQuestDetails function.

If no input is provided, the function returns a list of all questionnaires on the server:

questlist <- suso_getQuestDetails()
# print(questlist)

By specifying operation.type = "statuses", you receive a list of statuses.

statlist <- suso_getQuestDetails(operation.type = "statuses")
print(statlist)
#>  [1] "Restored"               "Created"                "SupervisorAssigned"     "InterviewerAssigned"   
#>  [5] "RejectedBySupervisor"   "ReadyForInterview"      "SentToCapi"             "Restarted"             
#>  [9] "Completed"              "ApprovedBySupervisor"   "RejectedByHeadquarters" "ApprovedByHeadquarters"
#> [13] "Deleted"

By taking a particular QuestionnaireId and Version and specifying the operation.type, you can execute further requests. For example:

questionnaire <- suso_getQuestDetails(operation.type = "structure", quid = questlist[2, QuestionnaireId],
    version = questlist[2, Version])
questionnaire <- questionnaire[, .(VariableName, type, QuestionText, Featured, PublicKey)]
questionnaire <- questionnaire[!is.na(QuestionText)]
head(questionnaire, 19L)
#>           VariableName                  type
#>  1:   Isl_SupEnum_Dist          TextQuestion
#>  2:          gps_start GpsCoordinateQuestion
#>  3:         date_start      DateTimeQuestion
#>  4:            mainMap          AreaQuestion
#>  5:             id_str      TextListQuestion
#>  6:           date_end      DateTimeQuestion
#>  7:            gps_end GpsCoordinateQuestion
#>  8:          gps_struc GpsCoordinateQuestion
#>  9:            address          TextQuestion
#> 10:             id_dwe      TextListQuestion
#> 11:      Desc_Dwelling          TextQuestion
#> 12:          Bldg_Name          TextQuestion
#> 13: HL1a_PrivDwel_Inst        SingleQuestion
#> 14:   HL1b_InstOn_Prop        SingleQuestion
#> 15:     HL1c_Dwel_Name          TextQuestion
#> 16:       H1_Typ_Dwell        SingleQuestion
#> 17:    HL1a1_Inst_Name          TextQuestion
#> 18:  HL1a2_Type_Instit        SingleQuestion
#> 19:        V1_Vac_Dwel        SingleQuestion
#>                                                                                                                                                                          QuestionText
#>  1:                                                                                                                                  Island-Supervisory District-Enumeration District
#>  2:                                                                                                           I01. <big>GPS</big> of the <big>START</big> of the Enumeration District
#>  3:                                                                                                  I02. <big>DATE and TIME</big> at the <big>START</big> of Census data collection.
#>  4:                                                                                                                       Please mark all the buildings in your Enumeration District!
#>  5:                                                                                                                                                   Please give the BUILDING number
#>  6:                                                                                                            Q14. <big>DATE AND TIME</big> OF THE <big>END</big> OF THE ENUMERATION
#>  7:                                                                                                                      Q13. <big>GPS</big> AT THE <big>END</big> OF THE ENUMERATION
#>  8:                                                                                                                      B1. GPS of <font color="blue">BUILDING NR. %str_list%</font>
#>  9:                                                                                                                  B2. Address of <font color="blue">BUILDING NR. %str_list%</font>
#> 10:                                                                                            Serial Number of the DWELLING inside <font color="blue">BUILDING NR. %str_list%</font>
#> 11: Please provide brief description of <font color="green">DWELLING NR. %dwe_list%</font>. (Apt.No., colour, other physical attributes fencing/gates, two-story,duplex.tri-plex etc)
#> 12:                      Please state building name and floor number for <font color="green">DWELLING NR. %dwe_list%</font>: (For highrise buildings only e.g Bayroc, Lucayan Towers)
#> 13:                                                             HL1a. Is <font color="green">DWELLING NR. %dwe_list%</font> a private dwelling, an institution, or a vacant dwelling?
#> 14:                                                               HL1b. Is this private <font color="green">DWELLING NR. %dwe_list%</font> located on the property of an institution?
#> 15:                                                                         HL1c. Give the name of this institution <font color="green">DWELLING NR. %dwe_list%</font> is located on.
#> 16:                                                                                                              H1. What type is <font color="green">DWELLING NR. %dwe_list%</font>?
#> 17:                                                                                                                                         HL1a1. Give the name of this institution.
#> 18:                                                                                                                                         HL1a2. Which type of institution is this?
#> 19:                                                                                        V1. Indicate the status of this vacant <font color="green">DWELLING NR. %dwe_list%</font>.
#>     Featured                            PublicKey
#>  1:     TRUE 9c1df13b-ebd9-c948-c1f6-3d494baed6c7
#>  2:    FALSE a28e8db7-6c8d-41ac-b21f-5fd14263c202
#>  3:    FALSE 72546856-f6eb-4808-99c9-e5c3b8be7f52
#>  4:    FALSE 2a2e758d-6ffb-269a-d713-0727cef6a26c
#>  5:    FALSE 16141bf7-7953-4094-83f0-4a23b0c4af23
#>  6:    FALSE d755aa6a-dae2-4cae-a236-6c916fa6beb6
#>  7:    FALSE 2901b3f6-4eb8-4de9-82dc-fe2e4ceaea10
#>  8:    FALSE 27164c96-3beb-45a1-aa72-82e7f738bf9e
#>  9:    FALSE 3db0373c-324f-a413-de79-55d9fc7d5d5e
#> 10:    FALSE 7753608c-d771-42af-822e-7132e6a5bd61
#> 11:    FALSE a0e8659f-3059-66e6-3f54-f7b1736b4632
#> 12:    FALSE 8c84a470-f87d-bf7d-756b-9d37e5f49f77
#> 13:    FALSE 038e98bc-1b23-076b-23de-52b8114cb69f
#> 14:    FALSE d96bd046-c7be-a174-f52d-ca5dd0b2c74e
#> 15:    FALSE 2458d797-f5a6-d4c7-52dd-020bbc88bf88
#> 16:    FALSE 42ac223f-d5f9-6ff6-a2c2-af4ab55aa446
#> 17:    FALSE eb84fde3-e67f-5511-3d90-7de54766af8c
#> 18:    FALSE 4116e001-ba0d-6ed7-1878-ccc689079874
#> 19:    FALSE f8922061-4b57-a236-cf67-60e4adafdf73

This gives you a data.table containing all the questions, question texts, etc., which you can use for further processing, e.g. to render a user manual with rmarkdown. Find details in the manual on questionnaire creation.
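The question texts above still contain HTML markup, which usually has to be removed before the texts are rendered in a manual. The sketch below does this with base R regular expressions on two texts copied from the output above. The helper name strip_html is hypothetical, and substitution placeholders such as %str_list% are deliberately left untouched.

```r
# Hypothetical helper: drop HTML tags and collapse whitespace
strip_html <- function(x) {
  x <- gsub("<[^>]+>", "", x)   # remove anything that looks like a tag
  x <- gsub("\\s+", " ", x)     # collapse runs of whitespace
  trimws(x)
}

texts <- c(
  "I01. <big>GPS</big> of the <big>START</big> of the Enumeration District",
  "B1. GPS of <font color=\"blue\">BUILDING NR. %str_list%</font>"
)

clean <- strip_html(texts)
print(clean)
```

With the real table, the same function could be applied to the QuestionText column.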

You can also get a list of all interviews done for the specific questionnaire.

interviews <- suso_getQuestDetails(operation.type = "interviews", quid = questlist[2, QuestionnaireId],
    version = questlist[2, Version])
interviews <- interviews[, .(InterviewId, AssignmentId, ResponsibleId, ErrorsCount, Status)]
head(interviews, 20L)
#>                              InterviewId AssignmentId                        ResponsibleId ErrorsCount
#>  1: 0870df14-5b3a-431c-a2d9-d59a102d230f          404 d1e57346-b13e-47e8-adc1-aaf647711f47           0
#>  2: e1066cbd-2c04-414d-922c-cc93c9fe333f          404 d1e57346-b13e-47e8-adc1-aaf647711f47           0
#>  3: c84ba723-e19d-48a4-9447-176b22571a5c          267 b1aaf473-14fe-4613-9c66-3185b69b1d11           0
#>  4: 800892cd-bcfd-4667-98dc-8aedee09215b          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#>  5: 59547fd9-51ed-4f20-aec7-0b94bcbf3495          267 b1aaf473-14fe-4613-9c66-3185b69b1d11           0
#>  6: 4e33a10a-8e60-4695-8fdf-0ea7dd69505c          404 d1e57346-b13e-47e8-adc1-aaf647711f47           0
#>  7: 7064a70e-2d86-4236-a84b-ad8656d76491          404 d1e57346-b13e-47e8-adc1-aaf647711f47           0
#>  8: 32f6e155-2493-46ea-81ab-c089fb51ff09          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#>  9: a8130651-8e45-40d3-9097-246cb20ec867          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#> 10: 1fd1fa02-dcf5-464e-ac47-9fcfa56be6be          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 11: 61e7c5f4-17fa-4b18-92a7-4c3fb8a92867          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 12: cb422fe2-cafd-4dc2-8f6f-2f9f416e0bb7          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 13: c4ffdc75-f5a7-4db0-b405-38c42625cb0e          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 14: 424b668c-ddb3-4d9a-beb3-6609cf46415b          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 15: b846cc1b-64bc-4c71-9c27-872b1dee8456          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#> 16: 16ad3f3b-0a93-42be-b921-d4e68968d715          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#> 17: d7f355a6-d213-4373-bc46-debe4044d71c          266 5ac32a2d-20c5-42b6-a30c-89367d9f65d2           0
#> 18: 18a6a7eb-30c9-41e2-a68e-7cdba9a6df17          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 19: 35d66617-bd38-48e5-9ce8-ea185aa89b4e          371 0d169565-df66-48b4-8c93-3375e21de136           0
#> 20: b2b83f26-e20e-44c8-9fc7-2ddba0fe52e9          353 0d169565-df66-48b4-8c93-3375e21de136           0
#>                  Status
#>  1:           Completed
#>  2: InterviewerAssigned
#>  3:           Completed
#>  4:           Completed
#>  5:           Completed
#>  6:           Completed
#>  7:           Completed
#>  8:           Completed
#>  9:           Completed
#> 10:           Completed
#> 11:           Completed
#> 12:           Completed
#> 13:           Completed
#> 14:           Completed
#> 15:           Completed
#> 16:           Completed
#> 17:           Completed
#> 18:           Completed
#> 19:           Completed
#> 20:           Completed
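A table like this lends itself to quick monitoring summaries, for instance counting interviews per status or summing errors per responsible user. The sketch below uses a small invented data.frame with the same column names. With real data, the interviews object from above would take its place.

```r
# Invented sample with the column names shown above
interviews_demo <- data.frame(
  InterviewId   = c("a", "b", "c", "d", "e"),
  ResponsibleId = c("u1", "u1", "u2", "u2", "u2"),
  ErrorsCount   = c(0, 0, 2, 0, 1),
  Status        = c("Completed", "InterviewerAssigned", "Completed",
                    "Completed", "Completed")
)

# Number of interviews per status
status_counts <- table(interviews_demo$Status)

# Total error count per responsible user, completed interviews only
completed <- interviews_demo[interviews_demo$Status == "Completed", ]
per_user <- aggregate(ErrorsCount ~ ResponsibleId, data = completed, FUN = sum)

print(status_counts)
print(per_user)
```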

Quick statistics

To monitor variables of interest, you can use the suso_get_stats function.

statquest <- suso_get_stats(questID = questlist[2, QuestionnaireId], version = questlist[2, Version],
    qQuest = questionnaire[13, PublicKey])
print(statquest)
#>        TEAMS      TEAM MEMBER Private dwelling Institution Vacant dwelling Abandoned/Dilapidated Total
#> 1: All teams All interviewers               90          10               6                    31   137
#> 2: Team_2101        Int1_2101               51           2               2                    29    84
#> 3: Team_2101        Int2_2101               17           3               2                     0    22
#> 4: Team_2101        Int3_2101               12           3               2                     1    18
#> 5: Team_2101        int4_2101               10           2               0                     1    13
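The counts can be turned into shares per interviewer, and the “All teams” row offers a simple consistency check. The sketch below re-enters the numbers printed above as a plain data.frame, with column names shortened for convenience, rather than calling the API.

```r
# Numbers copied from the output above; column names shortened
stats <- data.frame(
  member      = c("All interviewers", "Int1_2101", "Int2_2101",
                  "Int3_2101", "int4_2101"),
  private     = c(90, 51, 17, 12, 10),
  institution = c(10, 2, 3, 3, 2),
  vacant      = c(6, 2, 2, 2, 0),
  abandoned   = c(31, 29, 0, 1, 1),
  total       = c(137, 84, 22, 18, 13)
)

categories <- c("private", "institution", "vacant", "abandoned")

# Consistency: the categories add up to the total in every row
rows_ok <- rowSums(stats[, categories]) == stats$total

# Consistency: the interviewer rows add up to the "All interviewers" row
team_ok <- all(colSums(stats[-1, categories]) == unlist(stats[1, categories]))

# Share of each category per row, in percent
shares <- round(100 * stats[, categories] / stats$total, 1)
print(cbind(member = stats$member, shares))
```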

Full data export

To export the data collected in Survey Solutions, you use suso_export.

#> The last file has been created 5.534 hours ago.
#> [1] "assignment__actions"
#> [1] 0
#> [1] "****"
#> [1] "dwe_list"
#> [1] 2
#> [1] "****"
#> [1] "interview__actions"
#> [1] 0
#> [1] "****"
#> [1] "interview__diagnostics"
#> [1] 0
#> [1] "****"
#> [1] "interview__errors"
#> [1] 0
#> [1] "****"
#> [1] "str_list"
#> [1] 1
#> [1] "****"

Its return value is a list with the following elements: main, R1, R2, R3, where

  • main contains the files BAH_MiniPilot and interview__comments
  • R1 contains all rosters at the first level
  • R2 contains all rosters at the second level
  • R3 contains all rosters at the third level

Through the harmonized ID, the main and roster files can easily be combined. More on this in the specific vignette.
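As a rough sketch of how such a combination might look, the example below merges an invented main file with an invented first-level roster using base R. The key column name interview__id and all values are assumptions for illustration only. The actual harmonized ID is described in the export vignette.

```r
# Invented stand-ins for the main file and one first-level roster
main <- data.frame(
  interview__id = c("i1", "i2"),
  district      = c("North", "South")
)
roster <- data.frame(
  interview__id = c("i1", "i1", "i2"),
  dwelling_nr   = c(1, 2, 1)
)

# One row per roster entry, with the interview-level variables attached
combined <- merge(roster, main, by = "interview__id", all.x = TRUE)
print(combined)
```

Using all.x = TRUE keeps every roster row even if its interview is missing from the main file, which makes unmatched records visible as NA values.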

Paradata

To retrieve the paradata for the interviews of a particular questionnaire, use suso_export_paradata:

system.time(para1 <- suso_export_paradata(questID = questlist[2, QuestionnaireId], version = questlist[2, Version], 
    reloadTimeDiff = 24, onlyActiveEvents = FALSE, allResponses = TRUE))
#> 
#> The last file has been created 5 hours ago.
#> 
#> Starting download & file extraction. 
#> 
#> 
#> Calculating Response Timings.
#> 
#> Extracting GPS variable.
#> Processing: 
#>  AnswerSet
#> 
#>  AnswerRemoved
#> 
#>  ApproveByHeadquarter
#> 
#>  Restarted
#> 
#>  Reject
#> 
#>  QuestionDeclaredInvalid
#> 
#>  QuestionDeclaredValid
#> 
#> Export & Transformation finished.
#>    user  system elapsed 
#>   1.244   0.028   0.510

This will return a list of data.tables, separated by event type. Please bear in mind that paradata files can be fairly big, and processing them may require a lot of working memory. To decrease the load, you can adjust the following parameters:

  • onlyActiveEvents: if TRUE, only the active events initiated by users are processed.
  • allResponses: if FALSE, the response values are not processed individually but remain available in a single column; if TRUE, they are split into separate columns.

Running the same call again without passive events and without processing all responses reduces the processing time:

system.time(para2 <- suso_export_paradata(questID = questlist[2, QuestionnaireId], version = questlist[2, Version], 
    reloadTimeDiff = 24, onlyActiveEvents = TRUE, allResponses = FALSE))
#> 
#> The last file has been created 5 hours ago.
#> 
#> Starting download & file extraction. 
#> 
#> 
#> Calculating Response Timings.
#> 
#> Extracting GPS variable.
#> Processing: 
#>  AnswerSet
#> 
#>  AnswerRemoved
#> 
#>  ApproveByHeadquarter
#> 
#>  Restarted
#> 
#>  Reject
#> 
#> Export & Transformation finished.
#>    user  system elapsed 
#>   0.908   0.004   0.357

As the system timings show, changing these parameters reduced the elapsed time in this example from 0.510 to 0.357 seconds, roughly 30%. More details on how to work with paradata can be found in the corresponding vignette. The paradata export is returned as a list with the following elements:

  • full data: KeyAssigned, CommentSet, Completed, AnswerSet, AnswerRemoved, Restarted, Reject, QuestionDeclaredInvalid, QuestionDeclaredValid, actionDistr, userDistr, roleDistr
  • reduced data: KeyAssigned, CommentSet, Completed, AnswerSet, AnswerRemoved, Restarted, Reject, actionDistr, userDistr, roleDistr
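To see which elements the reduced call leaves out, the two sets of element names above can be compared. The vectors below are typed in from that list. With real results one would compare names(para1) and names(para2) instead.

```r
# Element names copied from the list above
full_names <- c("KeyAssigned", "CommentSet", "Completed", "AnswerSet",
                "AnswerRemoved", "Restarted", "Reject",
                "QuestionDeclaredInvalid", "QuestionDeclaredValid",
                "actionDistr", "userDistr", "roleDistr")
reduced_names <- c("KeyAssigned", "CommentSet", "Completed", "AnswerSet",
                   "AnswerRemoved", "Restarted", "Reject",
                   "actionDistr", "userDistr", "roleDistr")

# Elements present only in the full export
only_in_full <- setdiff(full_names, reduced_names)
print(only_in_full)
```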

The returned list also includes ready-made summary tables, such as:

para2[["userDistr"]]
#>      responsible count
#> 1:                3460
#> 2:     Int1_2101  1025
#> 3:     Int2_2101   305
#> 4:     Int3_2101   250
#> 5:     int4_2101   146
#> 6: bahamaAPI0202    18
#> 7:         admin     5

which gives the distribution of events by user, or:

para2[["actionDistr"]]
#>                      action count
#>  1:   QuestionDeclaredValid  2574
#>  2:               AnswerSet  1482
#>  3:             VariableSet   525
#>  4:                 Resumed   150
#>  5:                  Paused   120
#>  6:               Completed    69
#>  7:    ReceivedBySupervisor    65
#>  8:     InterviewerAssigned    47
#>  9:             KeyAssigned    45
#> 10:      SupervisorAssigned    45
#> 11:   ReceivedByInterviewer    24
#> 12:    RejectedBySupervisor    21
#> 13:           AnswerRemoved    19
#> 14: QuestionDeclaredInvalid    15
#> 15:               Restarted     4
#> 16:              CommentSet     4

which gives the distribution by event type.
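These distribution tables can be post-processed like any other table. As a small sketch, the example below re-enters the counts printed above, checks that both tables cover the same number of events, and computes the share of each event type.

```r
# Counts copied from the two tables above
user_counts <- c(3460, 1025, 305, 250, 146, 18, 5)
action_counts <- c(
  QuestionDeclaredValid = 2574, AnswerSet = 1482, VariableSet = 525,
  Resumed = 150, Paused = 120, Completed = 69, ReceivedBySupervisor = 65,
  InterviewerAssigned = 47, KeyAssigned = 45, SupervisorAssigned = 45,
  ReceivedByInterviewer = 24, RejectedBySupervisor = 21, AnswerRemoved = 19,
  QuestionDeclaredInvalid = 15, Restarted = 4, CommentSet = 4
)

# Both tables should describe the same set of events
total_events <- sum(action_counts)
same_total <- total_events == sum(user_counts)

# Share of each event type in percent
action_share <- round(100 * action_counts / total_events, 1)
print(same_total)
print(head(action_share, 3))
```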

We hope this gave you a short overview of the available functions. For more details on how to use the output, please read the specific vignettes.