This spike explored pulling personal fitness data from the Strava API and Google Fitness REST API into a unified local store for the Life OS project. The goal: understand what data is available, how authentication works, what the rate limits are, and produce working proof-of-concept scripts that pull real data.
The physical setup: an Amazfit band syncs to the Zepp app on the phone. Zepp then syncs data to both Google Fit (steps, heart rate, sleep) and Strava (via HealthSync, which bridges workouts). Strava also receives cycling data directly from a Garmin Edge 1050. This means some data appears in both services, which the schema must handle.
The team of five (Product Owner, Strava Researcher, Google Fit Researcher, Schema Designer, Business Analyst) completed all research, scripts, and documentation in a single day.
| ID | Title | Artefact | Status | Owner |
|---|---|---|---|---|
| 1 | Strava API documentation | docs/spike-001/strava-api-notes.md | Complete | Strava Researcher |
| 2 | Strava PoC connector script | src/connectors/strava/pull.js | Complete | Strava Researcher |
| 3 | Google Fit API documentation | docs/spike-001/google-fit-api-notes.md | Complete | Google Fit Researcher |
| 4 | Google Fit PoC connector script | src/connectors/google-fit/pull.js | Complete (blocked by scopes) | Google Fit Researcher |
| 5 | Unified data schema proposal | docs/spike-001/schema-proposal.md | Complete | Schema Designer |
| 6 | Acceptance criteria and sign-off | docs/spike-001/acceptance-criteria.md | Complete | Product Owner |
| 7 | Spike report | reports/spike-001-report.html | Complete | Business Analyst |

- The PoC scripts use fetch and only core modules (fs, path). No npm install needed.
- Duplicates are flagged with dedupe_key; no auto-merge. HealthSync bridges Google Fit activities into Strava, creating duplicates. Both records are stored. A dedupe_key (date + activity type + duration bucket) lets the consumer decide how to handle overlaps.
- Four record types: activity, health_metric, body_measurement, and medication. Future data sources (food logging, smart scales, calendar events) map into these without schema changes.

Authentication: OAuth 2.0 with refresh tokens. The PoC script exchanges a refresh token for a fresh access token via POST https://www.strava.com/oauth/token. No browser interaction is required. Access tokens expire after 6 hours (21,600 seconds). The refresh token may rotate on each refresh, so the latest one must always be stored.
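
The refresh flow described above can be sketched as follows. This is a minimal illustration, not the code in src/connectors/strava/pull.js: the function names, the shape of the `credentials` and `tokens` objects, the 300-second safety margin, and the injected `saveTokens` callback are all assumptions made for the example. The endpoint and the form fields (`client_id`, `client_secret`, `grant_type`, `refresh_token`) follow Strava's OAuth documentation.

```javascript
const STRAVA_TOKEN_URL = 'https://www.strava.com/oauth/token';

// Build the form body for a refresh-token grant.
function buildRefreshBody(credentials, refreshToken) {
  return new URLSearchParams({
    client_id: String(credentials.clientId),
    client_secret: credentials.clientSecret,
    grant_type: 'refresh_token',
    refresh_token: refreshToken,
  });
}

// Merge a token response into the stored state.
// The refresh token may rotate, so the newest one always wins.
function mergeTokenResponse(tokens, response) {
  return {
    ...tokens,
    accessToken: response.access_token,
    refreshToken: response.refresh_token || tokens.refreshToken,
    expiresAt: response.expires_at, // Unix epoch seconds
  };
}

// True when there is no access token or it expires within marginSeconds.
function needsRefresh(tokens, nowMs = Date.now(), marginSeconds = 300) {
  if (!tokens.accessToken || !tokens.expiresAt) return true;
  return tokens.expiresAt - marginSeconds <= Math.floor(nowMs / 1000);
}

// saveTokens is a hypothetical persistence callback (for example, a JSON file writer).
async function refreshStravaToken(credentials, tokens, saveTokens) {
  const res = await fetch(STRAVA_TOKEN_URL, {
    method: 'POST',
    body: buildRefreshBody(credentials, tokens.refreshToken),
  });
  if (!res.ok) throw new Error(`Strava token refresh failed: HTTP ${res.status}`);
  const next = mergeTokenResponse(tokens, await res.json());
  await saveTokens(next); // persist before use so a rotated token is never lost
  return next;
}
```

Persisting the merged tokens before returning them is a deliberate choice in this sketch: if the process crashes after a rotation but before the write, the old refresh token may no longer be valid.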
Endpoints tested:
- GET /api/v3/athlete : authenticated athlete profile (name, city, weight, subscription status)
- GET /api/v3/athlete/activities : paginated activity list (max 200 per page, page-based pagination, empty array when exhausted)
- GET /api/v3/activities/{id} : full activity detail including calories, splits, laps, full polyline, segment efforts
- GET /api/v3/athletes/{id}/stats : lifetime and year-to-date totals for rides, runs, swims

Data volume: 483 rides (12,905 km total, 148 km of climbing), 215 runs (1,358 km total), biggest single ride 165 km. Account created 2014-08-11. Over 10 years of activity data.
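
Pulling that full history means walking the activity list page by page until an empty array comes back. The sketch below shows one way to do that. The function names and the injected `fetchJson` parameter are assumptions for illustration, not the PoC's actual structure. The query parameters (`page`, `per_page`, `after`, `before`) are Strava's, and `after` / `before` are sent as Unix epoch seconds.

```javascript
const STRAVA_API = 'https://www.strava.com/api/v3';

// Convert an ISO date string or Date to Unix epoch seconds.
function toEpochSeconds(value) {
  const ms = value instanceof Date ? value.getTime() : Date.parse(value);
  if (Number.isNaN(ms)) throw new Error(`Invalid date: ${value}`);
  return Math.floor(ms / 1000);
}

// Build the activity list URL; per_page is capped at the documented maximum of 200.
function buildActivitiesUrl({ page = 1, perPage = 200, after, before } = {}) {
  const url = new URL(`${STRAVA_API}/athlete/activities`);
  url.searchParams.set('page', String(page));
  url.searchParams.set('per_page', String(Math.min(perPage, 200)));
  if (after !== undefined) url.searchParams.set('after', String(toEpochSeconds(after)));
  if (before !== undefined) url.searchParams.set('before', String(toEpochSeconds(before)));
  return url.toString();
}

// Default transport: global fetch with a bearer token.
async function defaultFetchJson(url, accessToken) {
  const res = await fetch(url, { headers: { Authorization: `Bearer ${accessToken}` } });
  if (!res.ok) throw new Error(`Strava request failed: HTTP ${res.status}`);
  return res.json();
}

// Request pages 1, 2, 3, ... and stop at the first empty array.
async function fetchAllActivities(accessToken, options = {}, fetchJson = defaultFetchJson) {
  const all = [];
  for (let page = 1; ; page += 1) {
    const batch = await fetchJson(buildActivitiesUrl({ ...options, page }), accessToken);
    if (!Array.isArray(batch) || batch.length === 0) break;
    all.push(...batch);
  }
  return all;
}
```

With roughly 700 activities and 200 per page, a full pull in this style would take four list requests plus one final empty page.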
Devices observed:
- Garmin Edge 1050: garmin_ping_*. Provides GPS, HR, cadence, estimated power, temperature.
- HealthSync: strava_activity_upload.healthsync.fit. Walks and workouts.

Units: all metric/SI. Distance in meters, speed in m/s, elevation in meters, temperature in Celsius, weight in kg, HR in bpm, power in watts, energy in kJ.
Rate limits:
- Current limits and usage are reported in the X-RateLimit-Limit and X-RateLimit-Usage response headers

Key gotchas:
- The activity list returns resource_state: 2 (summary). Calories, splits, laps, and full polyline are only available at resource_state: 3 (detail endpoint).
- sport_type is the modern, more granular replacement for type. Strava recommends using it going forward.
- When device_watts: false, power values are estimated by Strava from speed, weight, and terrain.
- The before and after query params require Unix epoch timestamps, not ISO dates.

Authentication: OAuth 2.0 with refresh tokens. Token refresh works: the script obtains a valid access token from POST https://oauth2.googleapis.com/token. However, all Fitness API calls return 403 Insufficient Permission because the current OAuth token was created without fitness-specific scopes.
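
One way to confirm the missing-scope diagnosis, and to prepare the re-authorization, is sketched below. It assumes the token response from Google carries a space-delimited `scope` string and that the four fitness read scopes used in this spike are the ones required. The function names and the `redirectUri` value are hypothetical. The authorization endpoint and the `access_type=offline` and `prompt=consent` parameters come from Google's OAuth 2.0 documentation.

```javascript
// Full scope URLs for the four fitness read scopes this spike needs.
const FITNESS_SCOPES = [
  'https://www.googleapis.com/auth/fitness.activity.read',
  'https://www.googleapis.com/auth/fitness.body.read',
  'https://www.googleapis.com/auth/fitness.sleep.read',
  'https://www.googleapis.com/auth/fitness.heart_rate.read',
];

// grantedScope is the space-delimited scope string from a token response.
// Returns the fitness scopes that the token does not cover.
function missingFitnessScopes(grantedScope = '') {
  const granted = new Set(grantedScope.split(/\s+/).filter(Boolean));
  return FITNESS_SCOPES.filter((scope) => !granted.has(scope));
}

// Build the browser URL for a one-time re-authorization.
function buildConsentUrl({ clientId, redirectUri }) {
  const url = new URL('https://accounts.google.com/o/oauth2/v2/auth');
  url.searchParams.set('client_id', clientId);
  url.searchParams.set('redirect_uri', redirectUri);
  url.searchParams.set('response_type', 'code');
  url.searchParams.set('scope', FITNESS_SCOPES.join(' '));
  url.searchParams.set('access_type', 'offline'); // ask for a refresh token
  url.searchParams.set('prompt', 'consent'); // force the consent screen so a new refresh token is issued
  return url.toString();
}
```

A non-empty result from `missingFitnessScopes` would explain a 403 Insufficient Permission response without needing to call the Fitness API at all.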
What needs to happen to unblock:
- OAuth client: 929872323027-...
- Add the fitness scopes:
  - fitness.activity.read (steps, activity segments)
  - fitness.body.read (weight, body metrics)
  - fitness.sleep.read (sleep stages and duration)
  - fitness.heart_rate.read (heart rate from Amazfit band)
- Re-authorize with prompt=consent to get a new refresh token that includes fitness scopes

Endpoints documented:
- GET /fitness/v1/users/me/dataSources : list all connected devices and data sources
- POST /fitness/v1/users/me/dataset:aggregate : aggregate data by time buckets (primary endpoint for daily summaries)
- GET /fitness/v1/users/me/dataSources/{id}/datasets/{start}-{end} : raw, non-aggregated data points

Data types available:
| Data | API Name | Expected from Amazfit |
|---|---|---|
| Steps | com.google.step_count.delta | Yes |
| Heart rate | com.google.heart_rate.bpm | Yes |
| Sleep | com.google.sleep.segment | Yes (light, deep, REM stages) |
| Weight | com.google.weight | Manual entry only |
| Calories | com.google.calories.expended | Yes (estimated) |
| Distance | com.google.distance.delta | Yes (estimated from steps) |
| SpO2 | N/A | Unlikely (Zepp may not sync this) |
| Stress | N/A | No (Zepp-proprietary, not in Google Fit) |
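
As an illustration of how the aggregate endpoint and these data type names fit together, the sketch below builds a daily-bucket request for steps and sums the integer values in the response. It could not be run against live data during the spike because of the scope blocker, so treat it as an untested outline. The function names are assumptions. The request fields (`aggregateBy`, `bucketByTime.durationMillis`, `startTimeMillis`, `endTimeMillis`) and the response shape (`bucket[].dataset[].point[].value[].intVal`) follow the Fitness REST API reference.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const AGGREGATE_URL = 'https://www.googleapis.com/fitness/v1/users/me/dataset:aggregate';

// Build a request body that aggregates one data type into 24-hour buckets.
function buildDailyAggregateBody(dataTypeName, startMs, endMs) {
  return {
    aggregateBy: [{ dataTypeName }],
    bucketByTime: { durationMillis: DAY_MS },
    startTimeMillis: startMs,
    endTimeMillis: endMs,
  };
}

// Sum all integer values in each bucket.
// `date` is the UTC calendar date of the bucket start, not the user's local date.
function parseDailyIntTotals(response) {
  return (response.bucket || []).map((bucket) => {
    let total = 0;
    for (const dataset of bucket.dataset || []) {
      for (const point of dataset.point || []) {
        for (const value of point.value || []) total += value.intVal || 0;
      }
    }
    const date = new Date(Number(bucket.startTimeMillis)).toISOString().slice(0, 10);
    return { date, total };
  });
}

async function fetchDailySteps(accessToken, startMs, endMs) {
  const res = await fetch(AGGREGATE_URL, {
    method: 'POST',
    headers: { Authorization: `Bearer ${accessToken}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(buildDailyAggregateBody('com.google.step_count.delta', startMs, endMs)),
  });
  if (!res.ok) throw new Error(`Google Fit aggregate failed: HTTP ${res.status}`);
  return parseDailyIntTotals(await res.json());
}
```

Heart rate and weight use floating-point values (`fpVal`) rather than `intVal`, so they would need a different reducer than the one shown here.
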
Rate limits:
Time formats (gotcha): The aggregate endpoint uses milliseconds since epoch. The raw dataset endpoint uses nanoseconds in the URL path. Data points use startTimeNanos / endTimeNanos. That is two precisions spread across three places in one API, and none of them is ISO 8601.
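
A small set of conversion helpers keeps these formats from leaking into the rest of the code. The sketch below is illustrative and the function names are assumptions. Nanosecond values exceed the safe integer range of a JavaScript number, so it uses BigInt for them.

```javascript
// Milliseconds since epoch (number or numeric string) to ISO 8601 UTC.
function millisToIso(ms) {
  return new Date(Number(ms)).toISOString();
}

// Nanoseconds since epoch (string or BigInt) to ISO 8601 UTC.
// Sub-millisecond precision is truncated because Date only holds milliseconds.
function nanosToIso(nanos) {
  const ms = BigInt(nanos) / 1000000n;
  return new Date(Number(ms)).toISOString();
}

// ISO 8601 string to nanoseconds since epoch, returned as a decimal string.
function isoToNanos(iso) {
  const ms = Date.parse(iso);
  if (Number.isNaN(ms)) throw new Error(`Invalid date: ${iso}`);
  return (BigInt(ms) * 1000000n).toString();
}

// Build the {start}-{end} path segment used by the raw dataset endpoint.
function datasetIdFor(startIso, endIso) {
  return `${isoToNanos(startIso)}-${isoToNanos(endIso)}`;
}
```

For example, `datasetIdFor('2024-01-01T00:00:00Z', '2024-01-02T00:00:00Z')` yields `1704067200000000000-1704153600000000000`.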
PoC script: The script at src/connectors/google-fit/pull.js is complete and handles token refresh, parallel fetching of data sources, steps, heart rate, and sleep, clean JSON output with parsed daily buckets, and graceful error handling. Once the scopes are added and the user re-authorizes, it should work without modification.
Deprecation note: Google Fit API is deprecated for new users as of 2024. Google is pushing Health Connect for Android. The REST API still works for existing users and returns data, but may not receive new features.
Four record types cover all current and planned data sources:
- activity : Strava rides, runs, walks, workouts; Google Fit activity segments
- health_metric : steps, heart rate, sleep, calories, distance (daily aggregates from Google Fit)
- body_measurement : weight from Strava profile, Google Fit, or a future smart scale
- medication : Ozempic tracking (date + dose), extensible for other medications

Source tagging: Every record includes a source object with four fields: api (strava, google_fit, manual), device, external_id, and upstream_id. This makes every record's origin unambiguous.
Timestamps: Always ISO 8601 in UTC. A separate timezone field (IANA name) and utc_offset (seconds) are stored where applicable. Google Fit epoch values are converted at ingest time.
HealthSync overlap: Store both the Strava and Google Fit copies. A dedupe_key field (date + activity type + duration bucket) lets the consumer identify likely duplicates at query time. No auto-merging.
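
A minimal sketch of how such a key could be built and used at query time is shown below. Several details are assumptions rather than part of the schema proposal: the 5-minute bucket size, the `|` separator, lower-casing the activity type, the record field names other than `source` and `dedupe_key`, and the mapping of Strava's `id` to `upstream_id`. The Strava input fields (`start_date`, `sport_type`, `type`, `elapsed_time`, `distance`, `device_name`, `external_id`, `id`) are real API fields.

```javascript
const BUCKET_SECONDS = 300; // assumption: 5-minute duration buckets

// date + activity type + duration bucket, using the UTC date of the start time.
function dedupeKey({ startIso, activityType, durationSeconds }) {
  const date = new Date(startIso).toISOString().slice(0, 10);
  const type = String(activityType).trim().toLowerCase();
  const bucket = Math.round(durationSeconds / BUCKET_SECONDS);
  return `${date}|${type}|${bucket}`;
}

// Map a Strava activity into an `activity` record (field names are illustrative).
function stravaActivityToRecord(a) {
  const activityType = (a.sport_type || a.type || 'unknown').toLowerCase();
  const startTime = new Date(a.start_date).toISOString();
  return {
    type: 'activity',
    start_time: startTime,
    activity_type: activityType,
    duration_s: a.elapsed_time,
    distance_m: a.distance,
    source: {
      api: 'strava',
      device: a.device_name || null,
      external_id: a.external_id || null,
      upstream_id: String(a.id), // assumption: Strava's own activity id
    },
    dedupe_key: dedupeKey({ startIso: startTime, activityType, durationSeconds: a.elapsed_time }),
  };
}

// Return groups of records that share a dedupe_key. Nothing is merged or deleted.
function groupLikelyDuplicates(records) {
  const groups = new Map();
  for (const record of records) {
    const group = groups.get(record.dedupe_key) || [];
    group.push(record);
    groups.set(record.dedupe_key, group);
  }
  return [...groups.values()].filter((group) => group.length > 1);
}
```

Bucketing by rounding has an edge case: two copies whose durations fall on either side of a bucket boundary get different keys. A consumer that needs higher recall could also compare neighbouring buckets. Activity type names also differ between the two APIs, so a real implementation would need a type mapping before the keys can match.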
Storage recommendation:
- JSON now, SQLite later, using a records table. Indexed columns for type, date, and source. A value_json column stores the full record for flexible querying via json_extract(). Migration from JSON is mechanical.

Units: All metric/SI. Conversion to display units happens at the presentation layer. See the full unit reference in the schema proposal.
Extensibility: Adding a new data source requires only a pull script and a mapping function into existing record types. No schema migrations needed.
| AC | Criterion | Result |
|---|---|---|
| AC-1 | Strava OAuth token refresh works via script | PASS |
| AC-2 | Script pulls activities list, activity details, athlete profile | PASS |
| AC-3 | Google Fitness API auth works | PARTIAL: token refresh works, but fitness scopes need manual browser re-authorization |
| AC-4 | Script pulls steps, heart rate, sleep from Google Fit | PARTIAL: script is ready and will work once scopes are fixed. Steps to fix are fully documented. |
| AC-5 | Both scripts output clean JSON to stdout | PASS |
| AC-6 | Unified schema covers timestamps, data types, source tagging, storage format | PASS |
| AC-7 | Rate limits and quotas documented for both APIs | PASS |
| AC-8 | Spike report published | PASS |
6 of 8 acceptance criteria fully met. 2 partially met (AC-3 and AC-4). The partial results are due to the Google Fit OAuth token missing fitness scopes. This requires a one-time manual browser flow to re-authorize with the correct scopes. The fix is fully documented with step-by-step instructions. Once done, the existing PoC script will work without modification.
Strava is fully working. The PoC script successfully refreshes tokens, pulls athlete profile, recent activities (paginated), activity detail (with splits, laps, calories), and lifetime stats. Over 10 years of cycling and running data (483 rides, 215 runs) is available and accessible.
Google Fit script is ready but needs a manual step. The PoC script handles token refresh, parallel data fetching, clean JSON output, and error handling. It is complete and tested against the auth flow. The only blocker is a one-time browser-based re-authorization to add fitness scopes to the OAuth token.
The schema is comprehensive. Four record types cover all current data sources (Strava, Google Fit) and planned future sources (food logging, Ozempic, smart scales). Every record is source-tagged. HealthSync overlap is handled with a dedupe strategy. JSON now, SQLite later.
- Re-authorize Google Fit with the fitness scopes and prompt=consent. Full instructions are in docs/spike-001/google-fit-api-notes.md.
- HealthSync duplicates are flagged by dedupe_key, but consumers need to be aware of potential double-counting in aggregations.