User Stats API

The User Stats API provides comprehensive statistics about Project Sidewalk users and their contributions in Chicago, Illinois, including labels placed, distance explored, and validation activities. Each user is identified by an anonymized ID, which persists over time.

Endpoint

Retrieve statistics for all registered users or filter based on specific criteria. See Query Parameters below.

GET /v3/api/userStats

Examples

/v3/api/userStats?filetype=json Get all user stats for Chicago, Illinois in JSON (default)

/v3/api/userStats?filetype=csv Get all user stats for Chicago, Illinois in CSV

/v3/api/userStats?filetype=csv&highQualityOnly=true Get all user stats for users marked as high_quality (in CSV)

/v3/api/userStats?filetype=json&minLabels=10 Get all user stats for users with 10 labels or more (in JSON)

/v3/api/userStats?filetype=json&minLabels=10&min_accuracy=0.9 Get all user stats for users with 10 labels or more and an accuracy of 90% or better (in JSON)

Query Parameters

This endpoint accepts the following optional query parameters.

Parameter        Type     Description
filetype         string   Output format. Options: json (default), csv.
minLabels        integer  Include only users with at least this many total labels. Default: 0 (no minimum).
min_meters       number   Include only users who have explored at least this many meters. Default: 0 (no minimum).
min_accuracy     number   Include only users with at least this label accuracy (0.0-1.0). Users without validation data will be excluded.
highQualityOnly  boolean  When set to true, include only users flagged as high-quality contributors. Default: false.
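
As a rough illustration of how these parameters combine, the sketch below assembles a request URL with Python's standard library. The host in BASE_URL is a placeholder (this page gives only the path), and build_user_stats_url with its keyword arguments is a hypothetical helper, not part of the API. The query parameter names are copied verbatim from the table above.

```python
from urllib.parse import urlencode

# Placeholder host: an assumption, since only the path is documented here.
BASE_URL = "https://sidewalk.example.org"
ENDPOINT = "/v3/api/userStats"


def build_user_stats_url(base_url=BASE_URL, filetype="json", min_labels=None,
                         min_meters=None, min_accuracy=None,
                         high_quality_only=None):
    """Build a userStats URL; only parameters that are set are included."""
    if filetype not in ("json", "csv"):
        raise ValueError("filetype must be 'json' or 'csv'")
    if min_accuracy is not None and not 0.0 <= min_accuracy <= 1.0:
        raise ValueError("min_accuracy must be between 0.0 and 1.0")

    params = {"filetype": filetype}
    if min_labels is not None:
        params["minLabels"] = int(min_labels)
    if min_meters is not None:
        params["min_meters"] = min_meters
    if min_accuracy is not None:
        params["min_accuracy"] = min_accuracy
    if high_quality_only is not None:
        # The documented examples use lowercase "true", not Python's "True".
        params["highQualityOnly"] = "true" if high_quality_only else "false"
    return f"{base_url}{ENDPOINT}?{urlencode(params)}"


print(build_user_stats_url(min_labels=10, min_accuracy=0.9))
```

Validating filetype and min_accuracy on the client is a design choice of this sketch; the server remains the authority on which values it accepts.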

Responses

Success Response (200 OK)

On success, the API returns an HTTP 200 OK status code and the requested data in the specified filetype format.

JSON Format (Default)

Returns an array of user statistics objects, each representing a single user's contribution data:

[
    {
        "user_id": "bfab6670-0955-440c-abe8-01c2d20696ba",
        "labels": 27,
        "meters_explored": 154.8437957763672,
        "labels_per_meter": 0.17436927556991577,
        "high_quality": true,
        "high_quality_manual": null,
        "label_accuracy": 0.9545454382896423,
        "validated_labels": 22,
        "validations_received": 22,
        "labels_validated_correct": 21,
        "labels_validated_incorrect": 1,
        "labels_not_validated": 5,
        "validations_given": 20,
        "dissenting_validations_given": 5,
        "agree_validations_given": 14,
        "disagree_validations_given": 6,
        "unsure_validations_given": 0,
        "stats_by_label_type": {
            "curb_ramp": {
                "labels": 16,
                "validated_correct": 15,
                "validated_incorrect": 1,
                "not_validated": 0
            },
            "no_curb_ramp": {
                "labels": 0,
                "validated_correct": 0,
                "validated_incorrect": 0,
                "not_validated": 0
            },
            // ... other label types
        }
    },
    // ... more user statistics objects
]

JSON Field Descriptions

Each user statistics object contains the following fields:

Field                         Type            Description
user_id                       string          Anonymized unique identifier for the user.
labels                        integer         Total number of labels placed by the user.
meters_explored               number          Total distance explored by the user in meters.
labels_per_meter              number | null   Average number of labels placed per meter explored, or null if no distance explored.
high_quality                  boolean         Whether the user is flagged as a high-quality contributor based on algorithmic assessment.
high_quality_manual           boolean | null  Manual override of high-quality status by administrators, or null if not set.
label_accuracy                number | null   Accuracy of the user's labels based on validations, ranging from 0.0 to 1.0, or null if no validations.
validated_labels              integer         Number of the user's labels that have been validated by others.
validations_received          integer         Total number of validations received on the user's own labels.
labels_validated_correct      integer         Number of the user's labels validated as correct.
labels_validated_incorrect    integer         Number of the user's labels validated as incorrect.
labels_not_validated          integer         Number of the user's labels that have not been validated.
validations_given             integer         Total number of validations performed by the user on others' labels.
dissenting_validations_given  integer         Number of validations where the user disagreed with the majority.
agree_validations_given       integer         Number of validations where the user agreed with the label.
disagree_validations_given    integer         Number of validations where the user disagreed with the label.
unsure_validations_given      integer         Number of validations where the user was unsure about the label.
stats_by_label_type           object          Breakdown of statistics by label type.
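
Several of these counts appear to be related. The sketch below checks a few relationships that hold for the sample object shown earlier (for instance, 21 correct + 1 incorrect = 22 validated labels). These relationships are inferred from that single sample and are an assumption, not a documented guarantee of the API. The function name check_sample_relationships is hypothetical.

```python
import math

# Top-level values copied from the sample JSON object in this document.
sample_user = {
    "labels": 27,
    "label_accuracy": 0.9545454382896423,
    "validated_labels": 22,
    "labels_validated_correct": 21,
    "labels_validated_incorrect": 1,
    "labels_not_validated": 5,
    "validations_given": 20,
    "agree_validations_given": 14,
    "disagree_validations_given": 6,
    "unsure_validations_given": 0,
}


def check_sample_relationships(user):
    """Map each inferred relationship to whether it holds for this record."""
    correct = user["labels_validated_correct"]
    incorrect = user["labels_validated_incorrect"]
    checks = {
        "validated_labels = correct + incorrect":
            user["validated_labels"] == correct + incorrect,
        "labels = validated_labels + labels_not_validated":
            user["labels"]
            == user["validated_labels"] + user["labels_not_validated"],
        "validations_given = agree + disagree + unsure":
            user["validations_given"]
            == user["agree_validations_given"]
            + user["disagree_validations_given"]
            + user["unsure_validations_given"],
    }
    # label_accuracy may be null (None) when a user has no validations.
    if user["label_accuracy"] is not None and correct + incorrect > 0:
        checks["label_accuracy = correct / (correct + incorrect)"] = math.isclose(
            user["label_accuracy"], correct / (correct + incorrect), rel_tol=1e-6
        )
    return checks


for name, holds in check_sample_relationships(sample_user).items():
    print(f"{name}: {holds}")
```

The tolerance in math.isclose allows for the reduced floating-point precision visible in the sample values.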

Label Type Statistics Fields

The stats_by_label_type object contains one key per label type. Each value holds the statistics for that label type:

Field                                           Type     Description
stats_by_label_type.[type]                      object   Statistics for a specific label type (e.g., "curb_ramp", "obstacle"). The available label types match those in the Label Types API, but are provided in snake_case format.
stats_by_label_type.[type].labels               integer  Number of labels of this type placed by the user.
stats_by_label_type.[type].validated_correct    integer  Number of this type of label validated as correct.
stats_by_label_type.[type].validated_incorrect  integer  Number of this type of label validated as incorrect.
stats_by_label_type.[type].not_validated        integer  Number of this type of label not yet validated.
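
The API does not report a per-type accuracy, but one can be derived from these counts. The sketch below computes validated_correct / (validated_correct + validated_incorrect) for each label type. That formula is an assumption that mirrors how the top-level label_accuracy appears to behave in the sample, and accuracy_by_label_type is a hypothetical helper name.

```python
# The two label types shown in the sample JSON in this document.
sample_stats_by_label_type = {
    "curb_ramp": {
        "labels": 16,
        "validated_correct": 15,
        "validated_incorrect": 1,
        "not_validated": 0,
    },
    "no_curb_ramp": {
        "labels": 0,
        "validated_correct": 0,
        "validated_incorrect": 0,
        "not_validated": 0,
    },
}


def accuracy_by_label_type(stats_by_label_type):
    """Return {label_type: accuracy}, or None where nothing was validated."""
    result = {}
    for label_type, stats in stats_by_label_type.items():
        validated = stats["validated_correct"] + stats["validated_incorrect"]
        # Avoid dividing by zero for types with no validated labels.
        result[label_type] = (
            stats["validated_correct"] / validated if validated else None
        )
    return result


print(accuracy_by_label_type(sample_stats_by_label_type))
# {'curb_ramp': 0.9375, 'no_curb_ramp': None}
```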

CSV Format

If filetype=csv is specified, the response body will be CSV data. The first row contains the header fields, with the stats_by_label_type object flattened into individual columns for each label type and statistic.

user_id,labels,meters_explored,labels_per_meter,high_quality,high_quality_manual,label_accuracy,validated_labels,...
bfab6670-0955-440c-abe8-01c2d20696ba,27,154.8437957763672,0.17436927556991577,true,,0.9545454382896423,22,...
814f4169-98a1-4afa-80da-3b46be1da405,687,9898.09765625,0.06940727680921555,true,,0.8013029098510742,614,...
...

CSV Column Descriptions

In CSV format, each row corresponds to a user, and the columns map to the JSON fields as follows:

  • The first set of columns matches the top-level attributes from the JSON format (e.g., user_id, labels, meters_explored)
  • The label type statistics are flattened into a set of columns for each label type, with the naming pattern [label_type]_[statistic]
  • For example: curb_ramp_labels, curb_ramp_validated_correct, curb_ramp_validated_incorrect, curb_ramp_not_validated
  • This flattened structure makes it easier to import the data into spreadsheet applications and data analysis tools
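
CSV values arrive as text, so a consumer usually has to convert them to numbers, booleans, and nulls. The sketch below reads the first eight columns of the sample rows above with Python's csv module. It assumes an empty field stands for null, as high_quality_manual suggests in the sample. The helpers parse_value and read_user_stats_csv are hypothetical names.

```python
import csv
import io

# First eight columns of the sample rows; the remaining columns are elided
# in this document and are therefore omitted here.
SAMPLE_CSV = """\
user_id,labels,meters_explored,labels_per_meter,high_quality,high_quality_manual,label_accuracy,validated_labels
bfab6670-0955-440c-abe8-01c2d20696ba,27,154.8437957763672,0.17436927556991577,true,,0.9545454382896423,22
814f4169-98a1-4afa-80da-3b46be1da405,687,9898.09765625,0.06940727680921555,true,,0.8013029098510742,614
"""


def parse_value(text):
    """Convert one CSV field to None, bool, int, or float where possible."""
    if text == "":
        return None  # assumption: an empty field represents null
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text


def read_user_stats_csv(csv_text):
    """Parse CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        # user_id is always kept as a string.
        {key: value if key == "user_id" else parse_value(value)
         for key, value in row.items()}
        for row in reader
    ]


rows = read_user_stats_csv(SAMPLE_CSV)
print(rows[0]["labels"], rows[0]["high_quality"], rows[0]["high_quality_manual"])
# 27 True None
```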

Error Responses

If an error occurs, the API will return an appropriate HTTP status code and a JSON response body containing details about the error.

  • 400 Bad Request: Invalid parameter values.
  • 404 Not Found: The requested resource does not exist.
  • 500 Internal Server Error: An unexpected error occurred on the server.
  • 503 Service Unavailable: The server is temporarily unable to handle the request.

Error Response Body

Error responses include a JSON body with the following structure:

{
    "status": 400, // HTTP Status Code
    "code": "INVALID_PARAMETER", // Machine-readable error code
    "message": "Invalid value for filetype parameter. Expected 'csv' or 'json'.", // Human-readable description
    "parameter": "filetype" // Optional: The specific parameter causing the error
}
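
A client can turn this body into a readable message. The sketch below assumes the real response is plain JSON without the inline // annotations shown above, since standard JSON does not allow comments. The function describe_api_error is a hypothetical helper, not part of the API.

```python
import json


def describe_api_error(body_text):
    """Summarize an error response body as a single line."""
    error = json.loads(body_text)
    message = f"{error['status']} {error['code']}: {error['message']}"
    # "parameter" is optional, so use .get() rather than indexing.
    parameter = error.get("parameter")
    if parameter:
        message += f" (parameter: {parameter})"
    return message


# The sample error body from this document, without the annotations.
sample_body = json.dumps({
    "status": 400,
    "code": "INVALID_PARAMETER",
    "message": "Invalid value for filetype parameter. Expected 'csv' or 'json'.",
    "parameter": "filetype",
})

print(describe_api_error(sample_body))
```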

Data Analysis Tips

The User Stats API provides rich data for analysis. Some tips for getting meaningful results:

  • Consider using minimum thresholds for label count and validated labels to ensure sufficient data for meaningful analysis
  • Look beyond just quantity - high label counts don't always equate to high-quality data
  • Analyze the relationship between labels per meter and accuracy to understand contribution thoroughness
  • Compare validation patterns across different types of labels to identify where quality issues might exist
  • Use the Label Types API to get proper color coding and descriptions for visualizations
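
The first tip can be applied on the client after download. The sketch below keeps only users who meet minimum label and validated-label counts and who have a non-null accuracy. The threshold values (50 and 20) are arbitrary assumptions chosen for illustration, filter_for_analysis is a hypothetical helper, and the two records reuse values from the sample rows in this document.

```python
# Values taken from the two sample users shown in the CSV section.
users = [
    {"user_id": "bfab6670-0955-440c-abe8-01c2d20696ba", "labels": 27,
     "validated_labels": 22, "label_accuracy": 0.9545454382896423},
    {"user_id": "814f4169-98a1-4afa-80da-3b46be1da405", "labels": 687,
     "validated_labels": 614, "label_accuracy": 0.8013029098510742},
]


def filter_for_analysis(users, min_labels=50, min_validated=20):
    """Keep users with enough labels and validations to analyze.

    The default thresholds are illustrative, not recommended values.
    """
    return [
        user for user in users
        if user["labels"] >= min_labels
        and user["validated_labels"] >= min_validated
        and user["label_accuracy"] is not None
    ]


kept = filter_for_analysis(users)
print([user["user_id"] for user in kept])
# ['814f4169-98a1-4afa-80da-3b46be1da405']
```

The minLabels and min_accuracy query parameters can do part of this filtering on the server. The parameter table lists no filter for validated label count, so that check is done here on the client.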

Contribute

Project Sidewalk is an open-source project created by the Makeability Lab and hosted on GitHub. We welcome your contributions! If you found a bug or have a feature request, please open an issue on GitHub.

You can also email us at sidewalk@cs.uw.edu
