====== API for Data Retrieval ======

SoSci Survey offers an API for data retrieval. Navigate to **Collected Data** -> **API for Data Retrieval** to create an API link.

API formats include:

  * JSON data
  * View single questionnaire \\ This function may be used to allow third parties to view single questionnaires.

Before you create the API link ('+' symbol in the top right corner), you may choose which records (e.g., interview and/or pretest data, interview date, etc.) may and may not be retrieved.

===== JSON data retrieval =====

By default, the API link retrieves all collected data (according to the retrieval criteria defined for the API link).

==== Export ====

The JSON file contains the following attributes.

  * ''metadata''\\ Metadata, for example the data retrieval timestamp and the retrieval criteria (''filter'').
  * ''errors''\\ Potential retrieval errors, for example a non-existing variable. Only exists in the file if any errors occurred.
  * ''data''\\ The dataset. Each case is a sub-attribute with the key ''C'' + case number.
    * ''C213''\\ Single cases (rows in the dataset) are objects with variable names as keys and their values stored as the object value. If there is no value for a variable (e.g., question not shown to participant, no answer), the variable is __not__ listed.
  * ''variables''\\ Additional information about the variables and their labels. Only exists in the file if the parameters ''infoVariables'', ''infoValues'' and/or ''infoQuestionText'' were added to the API call.
Example of a retrieved JSON file:

  {
    "metadata": {
      "project": "z2018",
      "datetime": "2018-08-01 22:35:18",
      "filter": [
        "Only selected cases: CASE 120, 121"
      ],
      "language": "ger"
    },
    "data": {
      "C120": {
        "CASE": 120,
        "SERIAL": "H3PVKWVM6H",
        "QUESTNNR": "short",
        "MODE": "interview",
        "STARTED": "2018-03-30 21:16:33",
        "T109_01": 4,
        "FINISHED": 1,
        "Q_VIEWER": 0,
        "LASTPAGE": 2,
        "MAXPAGE": 2
      },
      "C121": {
        "CASE": 121,
        "SERIAL": "LSFK1ZX25B",
        "QUESTNNR": "short",
        "MODE": "interview",
        "STARTED": "2018-03-30 21:16:55",
        "T109_01": 2,
        "FINISHED": 1,
        "Q_VIEWER": 0,
        "LASTPAGE": 2,
        "MAXPAGE": 2
      }
    }
  }

==== Parameter ====

All parameters are optional. If no parameters are added to the API link, all cases (as defined in the retrieval criteria) will be retrieved.

  * ''cases'' -- Cases (CASE) to be retrieved.
    * You may add ranges (1-100) or single values. Separate multiple values by commas, for example ''1,2,5,10-20''.
    * The last (!) value may be a range without an endpoint, for example ''101-'' to retrieve all cases starting from CASE 101.
    * **Note:** If the parameter requests cases that were excluded when the API link was created, only the permitted cases are retrieved. For example, if the API link is limited to cases ''1-100'' and the parameter ''cases'' is set to the range ''50-150'', cases 50 to 100 are retrieved.
    * Setting the parameter to ''none'' deactivates data retrieval, for example if only variable information is of interest.
  * ''SERIAL'' -- Restrict retrieval to cases with this/these personal identifier(s).
    * Multiple SERIALs can be specified separated by commas, e.g. ''ABCD12,BCDE23''.
    * If both ''cases'' and ''SERIAL'' are specified, only cases that meet both criteria at the same time are retrieved (intersection).
  * ''vList'' -- Variables to retrieve.
    * If the parameter ''vList'' is set, only (!) those variables will be retrieved.
    * Separate multiple variable names with commas, e.g. ''CASE,AB01_01,AB01_02''.
    * The parameter ''vSkip'' is ignored if ''vList'' is set.
  * ''vSkip'' -- Skip single variables.
  * ''vSkipTime'' -- Skip dwell time (TIME000) and last data timestamp (LASTDATA) variables.
  * ''vQuality'' -- Add [[:en:results:variables#quality_indicators|quality indicators]] (MISSING, MISSREL, TIME_RSI, DEG_TIME).\\ **Warning:** Some of the quality indicators are normalized on the basis of the retrieved data. This may lead to missing or biased values if not all data is retrieved.
  * ''vAddress'' -- Add contact info of panel members (email, phone number, UID) (only possible if panel data was imported as [[:en:survey:mailing#privacy_mode|person-related data]]).
  * ''startMin'' -- Retrieve only cases that started the questionnaire after this date.
    * Excludes all cases that started the questionnaire before this date (variable: STARTED).
    * May be set in the formats YYYY-MM-DD or YYYY-MM-DD hh:mm:ss, e.g. ''2018-04-01T16:30:30'' (the "T" may be replaced by a single space).
  * ''startMax'' -- Retrieve only cases that started the questionnaire no later than this date.
  * ''changed'' -- Retrieve only cases that changed some data after this date.
  * ''infoVariables'' -- Get additional variable information (e.g., scaling, input type).
  * ''infoValues'' -- Get value labels.
  * ''infoQuestionText'' -- Get question wording.

__Examples__:

Plain API link:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY

Retrieve only cases 120 and 121 and skip dwell time variables:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&vSkipTime&cases=120,121

Retrieve only variables STARTED, AB01, AB02 and AB03_01 for cases that changed data after 2018-08-01 12:00:00:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&vList=STARTED,AB01,AB02,AB03_01&changed=2018-08-01T12:00:00

Retrieve only variable information including value labels, without any case data:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&cases=none&infoValues

===== Data retrieval as CSV =====

CSV (Comma Separated Values) files are compatible with almost any spreadsheet and statistics program. The same [[#parameter|parameters]] are available as for JSON retrieval.
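Because the same parameters apply to JSON and CSV retrieval, the query string can be assembled by a small helper instead of by hand. The following Python sketch is only an illustration: the function ''build_api_url'' and its ''flags'' argument are hypothetical names, and the API link is the placeholder used in the examples above.

```python
from urllib.parse import urlencode

# Placeholder API link taken from the examples above; replace it with your own.
API_LINK = "https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY"


def build_api_url(api_link, flags=(), **params):
    """Append retrieval parameters to an API link (hypothetical helper).

    flags  -- parameters without a value, e.g. "vSkipTime" or "infoValues"
    params -- parameters with a value, e.g. cases="120,121"
    """
    parts = [api_link]
    # Value-less switches are appended by name only, as in the examples above.
    parts.extend(flags)
    if params:
        # Keep commas and colons readable; everything else is URL-escaped.
        parts.append(urlencode(params, safe=",:"))
    return "&".join(parts)


print(build_api_url(API_LINK, flags=["vSkipTime"], cases="120,121"))
print(build_api_url(API_LINK, vList="STARTED,AB01", changed="2018-08-01T12:00:00"))
```

The first call reproduces the second example link above. Keeping the value-less switches separate from the key/value parameters avoids emitting an empty ''vSkipTime='' assignment.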
==== CSV for Excel ====

In "CSV for Excel", the data is optimized for import into Excel. The following parameters allow you to customize the format:

  * ''decimal'' -- Set the decimal separator. Allowed values are "point" and "comma". By default, the decimal separator is set based on the base language of the survey project.
  * ''missing'' -- Handling of missing values. Allowed values are "code" (numeric code, default), "stata" (Stata codes) and "remove" (remove missing values from the table).
  * ''encoding'' -- File encoding. Allowed values are "utf-16" (UTF-16 LE for Excel, default), "utf-8" and "iso-8859-1".

==== CSV for R ====

With "CSV for R", the data is formatted so that it can be read optimally into GNU R. In addition, the parameter ''type=rScript'' can be used to retrieve an R script that imports the CSV file. This script either uses local data, if the parameter ''csvFile'' is specified, or fetches the current data from the server via the API.

  * ''useSettings'' -- Use the settings specified in **Download data** (by default, default settings for CSV formatting are used).
  * ''rScript'' -- Download the R script to import the CSV data.
  * ''csvFile'' -- Use a specific file name in the import script (by default, the data is loaded directly via HTTPS from the survey server).

===== Display of Individual Questionnaires =====

If an API link to view individual questionnaires is called in the browser, SoSci Survey offers an open input field for the number of the desired questionnaire (CASE). After entering the number, the print preview is shown.

**Note:** If the API link only allows access to a single case, it will be displayed immediately without asking for the number.

**Note:** The print view is subject to the same restrictions as when called up via **Collected data** -> **View data** -> Printer icon. Among other things, questions with rotated items in completed interviews use a different (random) order of items than in the interview.
===== Records by Person Code (SERIAL) =====

An API link for listing records by person code must have a ''SERIAL'' parameter added to the call, e.g.

  https://www.soscisurvey.de/project/?act=sIyhrICdaBxl6qiMagedOU1K&SERIAL=ABC12345

The response is an object in JSON notation that contains the query status (''result''), some metadata (''meta''), and an array (''cases'') of the records that carry this person identifier (SERIAL).

  {
    "result":"ok",
    "meta":{
      "project":"project",
      "datetime":"2020-07-14 16:03:00",
      "count":1
    },
    "cases":[
      {
        "CASE":"145",
        "SERIAL":"ABC12345",
        "FINISHED":true
      }
    ]
  }

The meta-information ''count'' refers to the number of matching records. Only those records that meet the criteria specified when the API link was created will be searched.

===== Reference =====

==== JSON ====

The variables in the JSON download include the description ("label") and the parameters "type" and "input". The following values are possible.

  * ''type'' -- Type of variable/value range of the variable
    * ''%%'BOOL'%%'' -- a boolean value (''true''/''false'')
    * ''%%'DICHOTOMOUS'%%'' -- a dichotomous value (1/2)
    * ''%%'NOMINAL'%%'' -- a numerical value with nominal coding
    * ''%%'ORDINAL'%%'' -- a numerical value with ordinal coding
    * ''%%'METRIC'%%'' -- a numerical value with metric coding (integer or decimal number)
    * ''%%'TEXT'%%'' -- a string (code or free text up to 64k characters long)
    * ''%%'TIME'%%'' -- a date and time specification (''YYYY-MM-DD hh:mm:ss'')
    * ''%%'DATE'%%'' -- a date specification (''YYYY-MM-DD'')
  * ''input'' -- Input format used for data collection
    * ''%%'OPEN'%%'' -- Open (text) input field
    * ''%%'CHECKBOX'%%'' -- Selection field on/off (checkbox or other visualization)
    * ''%%'SELECTION'%%'' -- Selection between different options (radio button or other visualization)
    * ''%%'SCALE'%%'' -- Selection between ordinally ordered options
    * ''%%'RANKING'%%'' -- Ranking of the option in relation to other options
    * ''%%'MEASURED'%%'' -- Data not explicitly provided by respondents, e.g. response times
    * ''%%'SYSTEM'%%'' -- Metadata and other information collected/managed by the system
    * ''%%'UNDEFINED'%%'' -- The input format for the variable is incorrect or not documented.

**Note:** If the type of a question, and thus the type of a variable, is changed during the survey (e.g. from an open input field to a scale item), the data may also contain values other than those specified in the variable list.
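As described under Export, a variable without a value is not listed in a case object, so client code usually has to fill such gaps itself before building a table. The following Python sketch shows one way to do that. It is an illustration under assumptions: ''cases_to_rows'' is a hypothetical helper, and the sample document is an abridged version of the Export example in which ''T109_01'' was removed from case 121 to show the effect.

```python
import json

# Abridged, partly made-up sample in the shape described in the Export section.
# T109_01 is deliberately missing from C121 to illustrate an unlisted variable.
SAMPLE = """
{
  "metadata": {"project": "z2018", "datetime": "2018-08-01 22:35:18"},
  "data": {
    "C120": {"CASE": 120, "QUESTNNR": "short", "T109_01": 4, "FINISHED": 1},
    "C121": {"CASE": 121, "QUESTNNR": "short", "FINISHED": 1}
  }
}
"""


def cases_to_rows(document, variables):
    """Return one dict per case with a fixed set of columns (hypothetical helper).

    Variables that are not listed for a case are filled with None.
    """
    return [
        {name: case.get(name) for name in variables}
        for case in document.get("data", {}).values()
    ]


rows = cases_to_rows(json.loads(SAMPLE), ["CASE", "T109_01", "FINISHED"])
for row in rows:
    print(row)
```

Using ''dict.get'' rather than direct indexing keeps the loop from failing on unlisted variables. Whether ''None'' or a numeric missing code is the better placeholder depends on the analysis tool that consumes the rows.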