Python API

Everything the app draws is available headless through the package's public functions. They follow one pipeline:

load_scanpath_data / load_sample_data / load_potec   →   (words, fixations)
        list_trials            →  which (participant, trial) combos exist
        compute_word_metrics   →  per-word reading measures (FFD/FPRT/RPD/TFD …)
        plot_scanpath          →  a static Plotly figure
        animate_scanpath       →  an animated replay figure
        save_figure            →  .html / .png / .svg / .pdf on disk

All names are importable straight from the package root:

import scanpath_studio as sps

words, fixations = sps.load_scanpath_data("ia.csv", "fixations.csv")
fig = sps.plot_scanpath(words, fixations, "p1", "t3", show_heatmap=False)
sps.save_figure(fig, "scanpath.png")

Lazy imports

import scanpath_studio stays cheap — pandas / plotly / streamlit are only pulled in on the first API call (the package re-exports api.py / datasets.py lazily). The first call therefore pays a one-time import cost.
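
One common way to get this kind of deferred loading is a module-level `__getattr__` (PEP 562). The sketch below only illustrates that technique and is not taken from the package: `make_lazy_module` is a hypothetical helper, and `json` / `math` stand in for the heavy dependencies so the example runs with the standard library alone.

```python
import importlib
import types


def make_lazy_module(name, exports):
    """Build a module whose attributes are imported on first access.

    `exports` maps a public name to the module that defines it.
    (Hypothetical helper, for illustration only.)
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Python calls this only when normal lookup on the module fails.
        try:
            target = exports[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target), attr)
        setattr(module, attr, value)  # cache: later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", {"dumps": "json", "sqrt": "math"})
resolved_before = "sqrt" in vars(lazy)   # nothing has been resolved yet
root = lazy.sqrt(9.0)                    # first access triggers the import
resolved_after = "sqrt" in vars(lazy)    # the name is now cached on the module
```

Caching the resolved attribute on the module means the import cost is paid once per name; every later access is an ordinary attribute lookup.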

Loading data

scanpath_studio.api.load_scanpath_data

load_scanpath_data(words: Optional[TablesLike] = None, fixations: Optional[TablesLike] = None, *, word_schema: Optional[dict] = None, fix_schema: Optional[dict] = None) -> Tuple[pd.DataFrame, pd.DataFrame]

Load and normalize a words/IA table and/or a fixations table.

words / fixations may be DataFrames, paths to .csv / .tsv / .parquet / .feather files, glob patterns, or lists of paths. Multi-file datasets (one file per participant and/or text) are concatenated, with each file's stem kept in a source_file column. Column schemas are auto-detected (EyeLink, Gazepoint, and snake_case names); pass word_schema / fix_schema mappings (field → column name, see controls.WORD_FIELD_SPECS) to override detection. For per-word reading measures, pass the result to compute_word_metrics.
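
The multi-file behaviour described above can be pictured with a small standard-library sketch. It is not the loader's implementation: `read_many`, the file names, and the column values are made up for illustration, and plain dict rows stand in for DataFrame rows.

```python
import csv
import glob
import tempfile
from pathlib import Path


def read_many(pattern):
    """Concatenate every CSV matching `pattern`, tagging rows with the file stem."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = Path(path).stem  # remember the origin file
                rows.append(row)
    return rows


# Two tiny per-participant files (made-up values) in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    for stem, x in [("p1_fixations", "100"), ("p2_fixations", "240")]:
        Path(folder, f"{stem}.csv").write_text(f"x,y,duration\n{x},50,180\n")
    combined = read_many(str(Path(folder, "*_fixations.csv")))

sources = [row["source_file"] for row in combined]
```

Keeping the stem on every row lets a participant or text ID encoded in the file name be recovered after the files have been merged.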

Either table may be omitted for datasets that ship only one report: the missing side comes back as an empty canonical frame and the plots simply skip that layer. Words without a participant column (stimulus-level AoIs) are broadcast across the participants found in the fixations, and fixations without x/y but with a word/AoI ID are placed at word-box centers.

Returns the normalized (words, fixations) frames the plotting functions expect. Raises ValueError if required fields can't be found.
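
The broadcasting and centre-placement rules described above can be sketched in plain Python. This is an illustration under assumptions, not the package's code: the helper names and the `word_id`, `x_min`, `x_max`, `y_min`, `y_max` column names are hypothetical.

```python
def broadcast_words(words, participants):
    """Copy stimulus-level word rows once per participant."""
    return [dict(word, participant_id=p) for p in participants for word in words]


def place_at_centers(fixations, words):
    """Give fixations that only carry a word ID the centre of that word's box."""
    centers = {
        w["word_id"]: ((w["x_min"] + w["x_max"]) / 2, (w["y_min"] + w["y_max"]) / 2)
        for w in words
    }
    placed = []
    for fix in fixations:
        x, y = centers[fix["word_id"]]
        placed.append(dict(fix, x=x, y=y))
    return placed


# Made-up stimulus-level words (no participant column) and ID-only fixations.
words = [
    {"word_id": 0, "x_min": 100, "x_max": 160, "y_min": 40, "y_max": 60},
    {"word_id": 1, "x_min": 170, "x_max": 250, "y_min": 40, "y_max": 60},
]
fixations = [
    {"participant_id": "p1", "word_id": 1, "duration": 200},
    {"participant_id": "p2", "word_id": 0, "duration": 150},
]

participants = sorted({f["participant_id"] for f in fixations})
words_per_participant = broadcast_words(words, participants)
placed = place_at_centers(fixations, words)
```

Here the two word rows become four (one copy per participant), and each fixation receives the midpoint of its word's bounding box as its x/y position.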

scanpath_studio.api.load_sample_data

load_sample_data() -> Tuple[pd.DataFrame, pd.DataFrame]

Return the bundled 3-participant OneStop demo, normalized and ready to plot.

scanpath_studio.datasets.load_potec

load_potec(root, *, readers: Optional[Iterable] = None, texts: Optional[Iterable[str]] = None, download: bool = False) -> Tuple[pd.DataFrame, pd.DataFrame]

Load PoTeC as normalized (words, fixations) frames, ready to plot.

root is a clone of the PoTeC repo (with the eye-tracking data downloaded) or any folder; with download=True the needed files are fetched into it on first use (~45 MB). Narrow the load with readers (e.g. [0, 1]) and/or texts (e.g. ["b0", "p3"]) — the full corpus is 75 readers × 12 texts = 900 trials.

Participants are PoTeC reader ids (as strings); trials are text ids (b0–b5 biology, p0–p5 physics):

import scanpath_studio
from scanpath_studio.datasets import load_potec

words, fixations = load_potec("data/PoTeC", readers=[0], texts=["b0"])
fig = scanpath_studio.plot_scanpath(words, fixations)

The PoTeC monitor was 1680×1050 (DELL P2210, 60 Hz); pass that as canvas_size to scanpath_studio.plot_scanpath for true-to-scale rendering.

Inspecting & measuring

scanpath_studio.api.list_trials

list_trials(words: DataFrame, fixations: DataFrame) -> pd.DataFrame

Plottable (participant_id, trial_id) combos.

Combos present in both frames when both are loaded; for single-report datasets (words-only or fixations-only), combos from whichever frame has data.
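
The selection rule above amounts to a set intersection with a fallback. The sketch below assumes rows are dicts keyed by participant_id and trial_id; `trial_combos` is a hypothetical stand-in, not the library function.

```python
def trial_combos(words, fixations):
    """Return sorted (participant_id, trial_id) pairs that can be plotted."""

    def combos(rows):
        return {(r["participant_id"], r["trial_id"]) for r in rows}

    in_words, in_fixations = combos(words), combos(fixations)
    if in_words and in_fixations:
        return sorted(in_words & in_fixations)  # both reports loaded: intersect
    return sorted(in_words or in_fixations)     # single report: use what exists


words = [
    {"participant_id": "p1", "trial_id": "t1"},
    {"participant_id": "p1", "trial_id": "t2"},
]
fixations = [
    {"participant_id": "p1", "trial_id": "t2"},
    {"participant_id": "p2", "trial_id": "t1"},
]

both = trial_combos(words, fixations)
words_only = trial_combos(words, [])
```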

scanpath_studio.api.compute_word_metrics

compute_word_metrics(words: DataFrame, fixations: DataFrame) -> pd.DataFrame

Per-word reading measures (FFD/FPRT/RPD/TFD, skips, regressions, …).

Pre-aggregated columns in words (EyeLink IA exports) are preserved; anything missing is computed from fixations + word bounding boxes. Takes the normalized frames from load_scanpath_data.
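
To make the four headline measures concrete, here is a minimal sketch that derives them from a fixation sequence already mapped to word indices. It follows one common set of definitions (FFD: first fixation of the first pass; FPRT: all first-pass fixations before leaving the word; RPD: everything from first entering the word until gaze moves to its right; TFD: all fixations on the word). The package's exact conventions may differ, and `reading_measures` is a hypothetical name.

```python
def reading_measures(fixations, n_words):
    """Compute FFD, FPRT, RPD and TFD (ms) per word.

    `fixations` is a list of (word_index, duration_ms) in temporal order.
    """
    measures = [
        {"FFD": 0, "FPRT": 0, "RPD": 0, "TFD": 0, "skipped": True}
        for _ in range(n_words)
    ]
    for word, duration in fixations:
        measures[word]["TFD"] += duration  # total time across all passes

    furthest = -1  # right-most word fixated so far
    for i, (word, duration) in enumerate(fixations):
        if word <= furthest:
            continue  # re-reading or regression target, not a first-pass entry
        m = measures[word]
        m["skipped"] = False  # "skipped" here means skipped in first pass
        m["FFD"] = duration
        for later_word, later_duration in fixations[i:]:
            if later_word != word:
                break  # gaze left the word: first pass is over
            m["FPRT"] += later_duration
        for later_word, later_duration in fixations[i:]:
            if later_word > word:
                break  # gaze moved past the word to the right
            m["RPD"] += later_duration
        furthest = word
    return measures


# Made-up sequence: word 1 triggers a regression to word 0; word 2 is only
# reached after word 3, so it counts as skipped in first pass.
sequence = [(0, 200), (1, 150), (1, 100), (0, 120), (1, 80), (3, 250), (2, 90)]
result = reading_measures(sequence, n_words=4)
```

In this sequence word 1 gets FFD 150, FPRT 250, RPD 450 and TFD 330, while word 2 has only a TFD of 90 because its single fixation came after the reader had already passed it.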

Plotting

Any keyword accepted by the underlying figure builder can be passed through plot_scanpath / animate_scanpath (e.g. show_heatmap=False, color_by="pass_index", saccade_color="#444", fixation_opacity=0.5, x_field="order_in_trial").

scanpath_studio.api.plot_scanpath

plot_scanpath(words: DataFrame, fixations: DataFrame, participant: Optional[str] = None, trial: Optional[str] = None, *, canvas_size: Optional[Tuple[int, int]] = None, base_font_size: int = 16, font_family: str = FONT_FAMILY, raw_gaze: Optional[DataFrame] = None, **figure_overrides) -> go.Figure

Build the canonical scanpath figure for one trial.

words / fixations are normalized frames from load_scanpath_data. participant / trial may be omitted when the frames contain exactly one combo. canvas_size is the monitor size in px; by default it is estimated from the data extents, so pass the real monitor resolution (e.g. (2560, 1440) for OneStop) to keep coordinates true to scale. raw_gaze is a normalized frame (see data.normalize_raw_gaze) and is filtered to the selected trial. Remaining keywords override the app's defaults and are forwarded to plots.make_scanpath_figure (e.g. show_heatmap=False, color_by="pass_index", x_field="order_in_trial").

scanpath_studio.api.animate_scanpath

animate_scanpath(words: DataFrame, fixations: DataFrame, participant: Optional[str] = None, trial: Optional[str] = None, *, canvas_size: Optional[Tuple[int, int]] = None, base_font_size: int = 16, font_family: str = FONT_FAMILY, playback_speed: float = 1.0, **animation_overrides) -> go.Figure

Build the animated scanpath replay for one trial.

Same trial selection and canvas semantics as plot_scanpath. The returned Plotly figure plays in real reading time scaled by playback_speed; save it as interactive HTML with save_figure, or rasterize to GIF/MP4 with animation_export.export_animation.

The animation builder accepts a subset of the static figure's options (show_words, show_word_labels, show_saccades, show_order, styling, and second-scanpath overlays) — an unsupported key raises a ValueError naming the valid ones rather than an opaque TypeError.
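
The fail-early behaviour described above is a standard keyword-validation pattern. The sketch below is illustrative only: `check_overrides` is a hypothetical helper, the option set lists just the four toggles named above (the real builder also accepts styling and overlay options), and the error wording is invented.

```python
# Partial, illustrative option set; the real list is longer.
SUPPORTED = {"show_words", "show_word_labels", "show_saccades", "show_order"}


def check_overrides(overrides, supported=SUPPORTED):
    """Reject unknown keyword overrides with a message listing the valid ones."""
    unknown = sorted(set(overrides) - set(supported))
    if unknown:
        raise ValueError(
            f"Unsupported option(s) {unknown}; valid options are {sorted(supported)}"
        )
    return dict(overrides)


accepted = check_overrides({"show_words": True, "show_order": False})

try:
    check_overrides({"show_heatmap": False})
except ValueError as error:
    message = str(error)
```

Validating before forwarding the keywords means a typo surfaces as a readable message at the call site instead of a TypeError from deep inside the figure builder.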

Saving

scanpath_studio.api.save_figure

save_figure(fig: Figure, path: Union[str, Path], *, scale: int = 2, width: Optional[int] = None, height: Optional[int] = None) -> Path

Save a figure by extension: .html (interactive, browser-free) or .png/.svg/.pdf (static via Kaleido, which needs Chrome/Chromium; run plotly_get_chrome -y once if it is missing). width / height set the raster output size in px (overriding the figure's intrinsic layout size); both are ignored for .html. Returns the written path.
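
Dispatching on the file extension, as described above, can be sketched as follows. `export_kind` is a hypothetical helper; the case-insensitive matching and the ValueError for other extensions are assumptions, not documented behaviour of save_figure.

```python
from pathlib import Path

STATIC_SUFFIXES = {".png", ".svg", ".pdf"}


def export_kind(path):
    """Classify an output path as interactive HTML or a static image."""
    suffix = Path(path).suffix.lower()  # assumption: matching ignores case
    if suffix == ".html":
        return "html"    # in Plotly terms: fig.write_html(path)
    if suffix in STATIC_SUFFIXES:
        return "static"  # in Plotly terms: fig.write_image(path, scale=..., ...)
    raise ValueError(f"Unsupported extension {suffix!r} for {path!r}")


kinds = [export_kind(name) for name in ("scanpath.html", "scanpath.png", "figure.PDF")]
```

Only the static branch needs Kaleido and a browser, which is why an .html export works on machines without Chrome installed.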

For rasterized animation (GIF / MP4) use the animation exporter:

from scanpath_studio.animation_export import export_animation

anim = sps.animate_scanpath(words, fixations, "p1", "t3")
export_animation(anim, "replay.mp4")   # or replay.gif — needs Kaleido + Chrome

See Export & troubleshooting for the Chrome / ffmpeg requirements.