API for Data Retrieval

SoSci Survey offers an API for data retrieval. To create an API link, navigate to Collected Data → API for Data Retrieval.

API formats include:

  • JSON data
  • View single questionnaire
    This function lets third parties view single questionnaires.

Before you create the API link ('+' symbol in the top right corner), you can restrict which records the link may retrieve (e.g., interview and/or pretest data, interview date).

JSON data retrieval

By default, the API link retrieves all collected data (according to the retrieval criteria defined for the API link).

Export

The JSON file contains the following attributes.

  • metadata
    Metadata, for example data retrieval timestamp, retrieval criteria (filter).
  • errors
    Potential retrieval errors, for example a non-existing variable. Only exists in the file if any errors occurred.
  • data
    The dataset. Each case is a sub-attribute whose key is C followed by the case number.
    • C213
      Each case (a row in the dataset) is an object with variable names as keys and the recorded answers as values. If there is no value for a variable (e.g., question not shown to participant, no answer), the variable is not listed.
  • variables
    Additional information about the variables and their labels. Only exists in the file if the parameters “infoVariables”, “infoValues” and/or “infoQuestionText” were added to the API call.
{
  "metadata": {
    "project": "z2018",
    "datetime": "2018-08-01 22:35:18",
    "filter": [
      "Only selected cases: CASE 120, 121"
    ],
    "language": "ger"
  },
  "data": {
    "C120": {
      "CASE": 120,
      "SERIAL": "H3PVKWVM6H",
      "QUESTNNR": "short",
      "MODE": "interview",
      "STARTED": "2018-03-30 21:16:33",
      "T109_01": 4,
      "FINISHED": 1,
      "Q_VIEWER": 0,
      "LASTPAGE": 2,
      "MAXPAGE": 2
    },
    "C121": {
      "CASE": 121,
      "SERIAL": "LSFK1ZX25B",
      "QUESTNNR": "short",
      "MODE": "interview",
      "STARTED": "2018-03-30 21:16:55",
      "T109_01": 2,
      "FINISHED": 1,
      "Q_VIEWER": 0,
      "LASTPAGE": 2,
      "MAXPAGE": 2
    }
  }
}
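
Because variables without a value are omitted from a case object, a consumer usually has to fill those gaps when turning the export into a table. The following is a minimal sketch of one way to do that, assuming the export has already been downloaded and parsed. The function name `cases_to_rows` is hypothetical, and raising an exception when "errors" is present is a design choice of this sketch, not documented API behaviour.

```python
import json

def cases_to_rows(export: dict):
    """Flatten the "data" attribute of a JSON export into uniform rows."""
    if "errors" in export:
        # "errors" only exists in the file if something went wrong.
        raise ValueError(f"API reported errors: {export['errors']}")
    cases = export.get("data", {})
    # Collect every variable name that occurs in any case (first-seen order).
    columns = []
    for case in cases.values():
        for name in case:
            if name not in columns:
                columns.append(name)
    # Variables missing from a case (not shown, no answer) become None.
    rows = [{name: case.get(name) for name in columns} for case in cases.values()]
    return columns, rows

# Shortened sample in the shape of the export shown above.
sample = json.loads("""
{
  "metadata": {"project": "z2018"},
  "data": {
    "C120": {"CASE": 120, "T109_01": 4, "FINISHED": 1},
    "C121": {"CASE": 121, "FINISHED": 1}
  }
}
""")
columns, rows = cases_to_rows(sample)
```

In this sample, case 121 has no entry for T109_01, so its row carries None for that column.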

Parameters

All parameters are optional. If no parameters are added to the API link, all cases (as defined in the retrieval criteria) will be retrieved.

  • cases – Cases (CASE) to be retrieved.
    • You may add ranges (1-100) or single values. Separate multiple values by commas, for example 1,2,5,10-20.
    • The last (!) value may be a range without an endpoint, for example 101- to retrieve all cases starting from CASE 101.
    • Note: If cases are specified that were restricted during the API link creation, only non-restricted cases are retrieved. For example, if the API link is limited to cases 1-100 and the parameter cases is set to the range 50-150, cases 50 to 100 are retrieved.
    • Setting the parameter to none deactivates data retrieval, for example if only variable information is of interest.
  • SERIAL – Restriction to cases with this/these personal identifier(s).
    • Multiple SERIALs can be specified separated by commas, e.g. ABCD12,BCDE23.
    • If both cases and SERIAL are specified, only cases that meet both criteria at the same time are retrieved (intersection).

  • vList – Variables to retrieve.
    • If the parameter vList is set, only (!) those variables will be retrieved.
    • Separate multiple variable names with commas, e.g. CASE,AB01_01,AB01_02.
    • The parameter vSkip is ignored if vList is set.
  • vSkip – Skip single variables.
  • vSkipTime – Skip dwell time (TIME000) and last data timestamp (LASTDATA) variables.
  • vQuality – Add quality indicators (MISSING, MISSREL, TIME_RSI, DEG_TIME).
    Warning: Some of the quality indicators are normalized on the basis of the retrieved data. This may lead to missing or biased values if not all data is retrieved.
  • vAddress – Add contact info of panel members (email, phone number, UID) (only possible if panel data was imported as person-related data).
  • startMin – Retrieve only cases that started the questionnaire after this date.
    • Excludes all cases that started the questionnaire before this date (variable: STARTED).
    • May be set in the formats YYYY-MM-DD or YYYY-MM-DDThh:mm:ss, e.g. 2018-04-01T16:30:30 (“T” may be replaced by a single space).
  • startMax – Retrieve only cases that started the questionnaire until this date.
  • changed – Retrieve only cases that changed some data after this date.
  • infoVariables – Get additional variable information (e.g., scaling, input type).
  • infoValues – Get value labels.
  • infoQuestionText – Get question wording.
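
The syntax of the cases parameter (single values, ranges, and an open-ended range) can be illustrated with a small client-side check. This is only a sketch of how the documented syntax reads, not the server's implementation; the helper name `case_matches` is hypothetical, and the sketch does not enforce the rule that only the last value may be open-ended.

```python
def case_matches(spec: str, case: int) -> bool:
    """Return True if `case` is covered by a spec such as "1,2,5,10-20" or "101-"."""
    if spec == "none":
        # "none" deactivates data retrieval entirely.
        return False
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, _, end = part.partition("-")
            # An empty end means an open-ended range such as "101-".
            if case >= int(start) and (end == "" or case <= int(end)):
                return True
        elif case == int(part):
            return True
    return False

# API link limited to cases 1-100, request asks for 50-150:
# only the overlap of both restrictions is retrieved.
link_limit = "1-100"
requested = "50-150"
retrieved = [c for c in range(1, 201)
             if case_matches(link_limit, c) and case_matches(requested, c)]
```

Here `retrieved` covers cases 50 to 100, matching the note on restricted cases in the list above.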

Examples:

Plain API link:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY

Retrieve only cases 120 and 121 and skip dwell time variables:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&vSkipTime&cases=120,121

Retrieve only variables STARTED, AB01, AB02 and AB03_01 for cases that changed data after 2018-08-01 12:00:00:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&vList=STARTED,AB01,AB02,AB03_01&changed=2018-08-01T12:00:00

Retrieve only value labels, without any case data:

  https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY&cases=none&infoValues
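
The example links above can also be assembled programmatically. The following sketch uses only the Python standard library; the helper names `build_api_url` and `fetch_json` are hypothetical, and the API link is the placeholder from the examples, so `fetch_json` is defined but not called here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def build_api_url(api_link, values=None, flags=()):
    """Append parameters to an API link that already contains "?act=..."."""
    parts = [api_link]
    # Flag parameters such as vSkipTime are passed without a value.
    parts.extend(quote(name) for name in flags)
    for name, value in (values or {}).items():
        # Keep commas, colons and hyphens readable in lists, ranges and dates.
        parts.append(f"{quote(name)}={quote(str(value), safe=',:-')}")
    return "&".join(parts)

def fetch_json(url, timeout=30):
    """Download and parse the JSON export (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)

# Placeholder API link taken from the examples; replace it with a real one.
API_LINK = "https://www.soscisurvey.de/PROJEKT/?act=uDywDXaYyNEY"
url = build_api_url(API_LINK, values={"cases": "120,121"}, flags=["vSkipTime"])
print(url)
```

With the placeholder link, this prints the same URL as the second example above. A date written with a space instead of “T” would be percent-encoded by `quote`.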
  

Data retrieval as CSV

CSV (Comma Separated Values) files are compatible with almost any spreadsheet and statistics program. The same parameters are available as for JSON retrieval.

CSV for Excel

In “CSV for Excel”, the data is optimized for import into Excel. The following parameters let you customize the format:

  • decimal – Set the decimal separator. Allowed values are “point” and “comma”. By default, the decimal separator is chosen based on the base language of the survey project.
  • missing – Handling of missing values. Allowed values are “code” (numeric code, default), “stata” (Stata codes) and “remove” (remove missing values from the table).
  • encoding – File encoding. Allowed values are “utf-16” (UTF-16 LE for Excel, default), “utf-8” and “iso-8859-1”.
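
When a “CSV for Excel” download is read by a script instead of Excel, the encoding and decimal settings have to be mirrored on the client side. The sketch below shows one possible approach. It rests on assumptions that are not stated on this page and should be checked against an actual download: a tab as column delimiter and variable names in the first row. The function names are hypothetical.

```python
import csv
import io

def to_number(value, decimal="comma"):
    """Convert a cell to int or float if possible, otherwise return it unchanged."""
    if value is None:
        return None
    candidate = value.replace(",", ".") if decimal == "comma" else value
    for convert in (int, float):
        try:
            return convert(candidate)
        except ValueError:
            continue
    return value

def read_excel_csv(raw: bytes, encoding="utf-16", decimal="comma", delimiter="\t"):
    """Decode a downloaded CSV file and return one dict per case."""
    text = raw.decode(encoding)
    # ASSUMPTION: tab-delimited columns with a header row of variable names.
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [{name: to_number(value, decimal) for name, value in record.items()}
            for record in reader]

# Synthetic bytes standing in for a download with encoding=utf-16, decimal=comma.
raw = "CASE\tAB01_01\tSERIAL\r\n120\t1,5\tH3PVKWVM6H\r\n".encode("utf-16")
rows = read_excel_csv(raw)
```

Text values that merely look numeric would also be converted by this sketch, so restricting the conversion to known numeric variables may be the safer choice in practice.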

CSV for R

With “CSV for R”, the data is formatted for straightforward import into GNU R. In addition, the parameter type=rScript can be used to retrieve an R script that imports the CSV file. This script either uses local data, if the parameter csvFile is specified, or fetches the current data from the server via the API.

  • useSettings – Use the settings specified in Download data (by default, default settings for CSV formatting are used).
  • rScript – Download the R script to import the CSV data.
  • csvFile – Use a specific file name in the import script (by default, the data is loaded directly via HTTPS from the survey server).

Display of Individual Questionnaires

If an API link to view individual questionnaires is called in the browser, SoSci Survey offers an open input field for the number of the desired questionnaire (CASE). After entering the number the print preview is shown.

Note: If the API link only allows access to a single case, it will be displayed immediately without asking for the number.

Note: The print view is subject to the same restrictions as when it is opened via Collected data → View data → Printer icon. Among other things, questions with rotated items in completed interviews use a different (random) order of items than in the interview.

Records by Person Code (SERIAL)

An API link for listing records by person code must have a SERIAL parameter added to the call, e.g.

  https://www.soscisurvey.de/project/?act=sIyhrICdaBxl6qiMagedOU1K&SERIAL=ABC12345
  

The API returns an object in JSON notation that contains the query status (result), some metadata (meta), and an array (cases) listing the records that carry this person identifier (SERIAL).

{
    "result":"ok",
    "meta":{
        "project":"project",
        "datetime":"2020-07-14 16:03:00",
        "count":1
    },
    "cases":[
        {
            "CASE":"145",
            "SERIAL":"ABC12345",
            "FINISHED":true
        }
    ]
}

The count value in meta is the number of matching records. Only those records that meet the criteria specified when the API link was created will be searched.
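
A client can use this response, for example, to find out which records of a person are already completed. The following sketch parses a response of the shape shown above; the function name `finished_cases` is hypothetical, and since this page documents only the status value "ok", any other value is treated as a failure here by assumption.

```python
import json

def finished_cases(response_text: str) -> list:
    """Return the CASE numbers of all finished records in a SERIAL lookup."""
    reply = json.loads(response_text)
    if reply.get("result") != "ok":
        # ASSUMPTION: every status other than "ok" indicates a failed lookup.
        raise RuntimeError(f"Lookup failed with result {reply.get('result')!r}")
    # CASE is delivered as a string in this response, so convert it to int.
    return [int(record["CASE"]) for record in reply.get("cases", [])
            if record.get("FINISHED")]

response_text = """
{
    "result": "ok",
    "meta": {"project": "project", "datetime": "2020-07-14 16:03:00", "count": 1},
    "cases": [{"CASE": "145", "SERIAL": "ABC12345", "FINISHED": true}]
}
"""
done = finished_cases(response_text)
```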

Reference

JSON

The variables in the JSON download include the description (“label”) and the parameters “type” and “input”. The following values are possible.

  • type – type of variable/value range of the variable
    • 'BOOL' – a boolean value (true/false)
    • 'DICHOTOMOUS' – a dichotomous value (1/2)
    • 'NOMINAL' – a numerical value with nominal coding
    • 'ORDINAL' – a numerical value with ordinal coding
    • 'METRIC' – a numerical value with metric coding (integer or decimal number)
    • 'TEXT' – a string (code or free text up to 64k characters long)
    • 'TIME' – a date and time specification (YYYY-MM-DD hh:mm:ss)
    • 'DATE' – a date specification (YYYY-MM-DD)
  • input – Input format used for data collection
    • 'OPEN' – Open (text) input field
    • 'CHECKBOX' – Selection field on/off (checkbox or other visualization)
    • 'SELECTION' – Selection between different options (radio button or other visualization)
    • 'SCALE' – Selection between ordinal ordered options
    • 'RANKING' – Sort the option in relation to other options
    • 'MEASURED' – Data not explicitly provided by respondents, e.g. response times
    • 'SYSTEM' – Metadata and other information collected/managed by the system
    • 'UNDEFINED' – The input format for the variable is incorrect or not documented.

Note: If the type of a question and thus the type of a variable is changed during the survey (e.g. from an open input field to a scale item), the data may also contain values other than those specified in the variable list.
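
The type information can be used to turn date and time strings into native objects on the client side. The sketch below converts only the 'TIME' and 'DATE' types, using the formats listed above, and returns every other value unchanged. In line with the note above, a value that does not fit its declared type is passed through rather than rejected. The function name `convert_value` is hypothetical.

```python
from datetime import date, datetime

def convert_value(value, var_type: str):
    """Convert TIME and DATE strings to Python objects; pass everything else through."""
    try:
        if var_type == "TIME":
            return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        if var_type == "DATE":
            return date.fromisoformat(value)
    except (TypeError, ValueError):
        # The data may contain values other than the declared type allows.
        return value
    return value

started = convert_value("2018-03-30 21:16:33", "TIME")
birthday = convert_value("2018-03-30", "DATE")
unexpected = convert_value(4, "TIME")
```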

en/results/data-api.txt · Last modified: 04.07.2024 09:23 by patrizia.bieber
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International