Python API

Everything the app draws is available headless through the package's public functions. They follow one pipeline:

load_scanpath_data / load_sample_data / load_potec   →   (words, fixations)
        list_trials            →  which (participant, trial) combos exist
        compute_word_metrics   →  per-word reading measures (FFD/FPRT/RPD/TFD …)
        plot_scanpath          →  a static Plotly figure
        animate_scanpath       →  an animated replay figure
        save_figure            →  .html / .png / .svg / .pdf on disk

All names are importable straight from the package root:

import scanpath_studio as sps

words, fixations = sps.load_scanpath_data("ia.csv", "fixations.csv")
fig = sps.plot_scanpath(words, fixations, "p1", "t3", show_heatmap=False)
sps.save_figure(fig, "scanpath.png")

Lazy imports

import scanpath_studio stays cheap — pandas / plotly / streamlit are only pulled in on the first API call (the package re-exports api.py / datasets.py lazily). The first call therefore pays a one-time import cost.
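One common way to get this kind of deferred re-export is a module-level `__getattr__` (PEP 562). The sketch below is not the package's own code: it builds a throwaway module object and uses two standard-library modules as stand-ins for the heavy dependencies. The names `_LAZY_EXPORTS`, `resolved`, and `demo_pkg` are made up for illustration.

```python
import importlib
import types

# Hypothetical table: public name -> module that really defines it.
# Standard-library modules stand in for the heavy submodules here.
_LAZY_EXPORTS = {"dumps": "json", "mean": "statistics"}
resolved = []  # records which modules were imported on demand


def _lazy_getattr(name):
    """Module-level __getattr__ (PEP 562): import the owner on first access."""
    try:
        module_name = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}") from None
    module = importlib.import_module(module_name)
    resolved.append(module_name)
    value = getattr(module, name)
    setattr(pkg, name, value)  # cache, so later lookups bypass __getattr__
    return value


pkg = types.ModuleType("demo_pkg")
pkg.__getattr__ = _lazy_getattr  # stored in the module's __dict__, as PEP 562 requires
```

Accessing `pkg.mean` triggers the import once; the second access reads the cached attribute and never reaches `_lazy_getattr`.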

Loading data

scanpath_studio.api.load_scanpath_data

load_scanpath_data(words: Optional[TablesLike] = None, fixations: Optional[TablesLike] = None, *, word_schema: Optional[dict] = None, fix_schema: Optional[dict] = None) -> Tuple[pd.DataFrame, pd.DataFrame]

Load and normalize a words/IA table and/or a fixations table.

words / fixations may be DataFrames, paths to .csv / .tsv / .parquet / .feather files, glob patterns, or lists of paths. Multi-file datasets (one file per participant and/or text) are concatenated, with each file's stem kept in a source_file column. Column schemas are auto-detected (EyeLink, Gazepoint, and snake_case names); pass word_schema / fix_schema mappings (field → column name, see controls.WORD_FIELD_SPECS) to override detection. For per-word reading measures, pass the result to compute_word_metrics.

Either table may be omitted for datasets that ship only one report: the missing side comes back as an empty canonical frame and the plots simply skip that layer. Words without a participant column (stimulus-level AoIs) are broadcast across the participants found in the fixations, and fixations without x/y but with a word/AoI ID are placed at word-box centers.

Returns the normalized (words, fixations) frames the plotting functions expect. Raises ValueError if required fields can't be found.
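A minimal sketch of loading a multi-file dataset with an optional schema override and a more helpful error message. The file-name patterns (`ia_*.csv`, `fix_*.csv`) and the helper name `load_with_overrides` are assumptions for illustration; only `load_scanpath_data`, its `fix_schema` keyword, and the `ValueError` come from the description above.

```python
from pathlib import Path


def load_with_overrides(data_dir, fix_schema=None):
    """Load every per-participant file under data_dir, or explain what failed."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps

    data_dir = Path(data_dir)
    try:
        return sps.load_scanpath_data(
            str(data_dir / "ia_*.csv"),    # glob: one words/IA file per participant (assumed naming)
            str(data_dir / "fix_*.csv"),   # glob: one fixation report per participant (assumed naming)
            fix_schema=fix_schema,         # None keeps schema auto-detection
        )
    except ValueError as exc:
        # Raised when required fields cannot be found in the columns.
        raise ValueError(
            f"{exc}; pass word_schema / fix_schema to map the fields by hand"
        ) from exc
```

A call might look like `load_with_overrides("data", fix_schema={"participant_id": "SUBJ"})`, where the field key and the column name are placeholders; the valid field names are the ones listed in controls.WORD_FIELD_SPECS.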

scanpath_studio.api.load_sample_data

load_sample_data() -> Tuple[pd.DataFrame, pd.DataFrame]

Return the bundled 3-participant OneStop demo, normalized and ready to plot.

scanpath_studio.datasets.load_potec

load_potec(root, *, readers: Optional[Iterable] = None, texts: Optional[Iterable[str]] = None, download: bool = False) -> Tuple[pd.DataFrame, pd.DataFrame]

Load PoTeC as normalized (words, fixations) frames, ready to plot.

root is a clone of the PoTeC repo (with the eye-tracking data downloaded) or any folder; with download=True the needed files are fetched into it on first use (~45 MB). Narrow the load with readers (e.g. [0, 1]) and/or texts (e.g. ["b0", "p3"]) — the full corpus is 75 readers × 12 texts = 900 trials.

Participants are PoTeC reader ids (as strings); trials are text ids (b0–b5 biology, p0–p5 physics):

import scanpath_studio
from scanpath_studio.datasets import load_potec
words, fixations = load_potec("data/PoTeC", readers=[0], texts=["b0"])
fig = scanpath_studio.plot_scanpath(words, fixations)

The PoTeC monitor was 1680×1050 (DELL P2210, 60 Hz); pass canvas_size=(1680, 1050) to scanpath_studio.plot_scanpath for true-to-scale rendering.
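Putting the pieces together, a small helper can load one reader/text pair and plot it on the true-to-scale canvas. This is a sketch under the assumptions stated in the comments; `plot_potec_trial` and `POTEC_CANVAS` are illustrative names, not part of the package.

```python
POTEC_CANVAS = (1680, 1050)  # PoTeC monitor resolution in px, as stated above


def plot_potec_trial(root, reader, text, **overrides):
    """Plot one PoTeC reader/text pair on a true-to-scale canvas."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps
    from scanpath_studio.datasets import load_potec

    words, fixations = load_potec(root, readers=[reader], texts=[text])
    # Participants are reader ids as strings; trials are text ids.
    return sps.plot_scanpath(
        words, fixations, str(reader), text,
        canvas_size=POTEC_CANVAS, **overrides,
    )
```

For example, `plot_potec_trial("data/PoTeC", 0, "b0", show_heatmap=False)` would forward `show_heatmap` to the figure builder alongside the canvas size.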

Inspecting & measuring

scanpath_studio.api.list_trials

list_trials(words: DataFrame, fixations: DataFrame) -> pd.DataFrame

Plottable (participant_id, trial_id) combos.

Combos present in both frames when both are loaded; for single-report datasets (words-only or fixations-only), combos from whichever frame has data.
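The listing is handy for batch rendering. The sketch below saves one static figure per combo; it assumes the returned frame exposes `participant_id` and `trial_id` columns (inferred from the description above), and the helper name and output naming scheme are illustrative.

```python
from pathlib import Path


def render_all_trials(words, fixations, out_dir, fmt="html", **overrides):
    """Save one static scanpath figure per plottable (participant, trial)."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    trials = sps.list_trials(words, fixations)
    # Assumption: the result has participant_id / trial_id columns.
    for row in trials.itertuples(index=False):
        fig = sps.plot_scanpath(
            words, fixations, str(row.participant_id), str(row.trial_id), **overrides
        )
        target = out_dir / f"{row.participant_id}_{row.trial_id}.{fmt}"
        written.append(sps.save_figure(fig, target))
    return written
```

Choosing `fmt="html"` as the default avoids the Kaleido / Chrome requirement that the static formats carry.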

scanpath_studio.api.compute_word_metrics

compute_word_metrics(words: DataFrame, fixations: DataFrame) -> pd.DataFrame

Per-word reading measures (FFD/FPRT/RPD/TFD, skips, regressions, …).

Pre-aggregated columns in words (EyeLink IA exports) are preserved; anything missing is computed from fixations + word bounding boxes. Takes the normalized frames from load_scanpath_data.
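For readers unfamiliar with the four abbreviations, here is a self-contained sketch of their usual textbook definitions, computed from a fixation sequence that has already been mapped to word indices. It is not the package's implementation: edge-case handling (skipped words, line changes, missing data) may differ there, and this version simply reports 0 for first-pass measures of words that were skipped on first pass.

```python
from collections import defaultdict


def reading_measures(fixations):
    """Compute FFD / FPRT / RPD / TFD from (word_index, duration_ms) pairs.

    `fixations` must be in temporal order; word indices follow reading order.
    """
    measures = defaultdict(lambda: {"FFD": 0, "FPRT": 0, "RPD": 0, "TFD": 0})
    furthest = -1  # right-most word index fixated so far

    for i, (word, duration) in enumerate(fixations):
        measures[word]["TFD"] += duration  # total fixation duration: every visit counts
        if word <= furthest:
            continue  # refixation or regression, not a first-pass entry
        furthest = word
        m = measures[word]
        m["FFD"] = duration  # first fixation duration
        # First-pass reading time: consecutive fixations before leaving the word.
        for w, d in fixations[i:]:
            if w != word:
                break
            m["FPRT"] += d
        # Regression path duration: everything until gaze moves past the word.
        for w, d in fixations[i:]:
            if w > word:
                break
            m["RPD"] += d
    return dict(measures)


trial = [(0, 200), (1, 180), (1, 120), (0, 150), (1, 100), (2, 220)]
result = reading_measures(trial)
```

In this trial, word 1 has FPRT 300 ms but RPD 550 ms, because the regression to word 0 and the return to word 1 both happen before the gaze moves on to word 2.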

Plotting

Any keyword accepted by the underlying figure builder can be passed through plot_scanpath / animate_scanpath (e.g. show_heatmap=False, color_by="pass_index", saccade_color="#444", fixation_opacity=0.5, x_field="order_in_trial").

To colour saccades by reading type instead of one uniform colour, pass saccade_color_mode="By type" (optionally with a saccade_class_colors={"regression": "#000", …} palette). For the heatmap, heatmap_norm="Log" compresses heavy-tailed dwell times. For the linear-reading schematic, saccade_render_mode="Arc" arches the saccades and fixation_snap_to_word=True places each fixation above its word.

To overlay an image stimulus (a screenshot of the reading screen) behind the scanpath, pass background_image="stim.png" with background_image_size=(w, h) (and background_image_origin=(x0, y0) for a centered crop); background_image_opacity dims a busy image so the AOIs / fixations remain legible over it.

scanpath_studio.api.plot_scanpath

plot_scanpath(words: DataFrame, fixations: DataFrame, participant: Optional[str] = None, trial: Optional[str] = None, *, canvas_size: Optional[Tuple[int, int]] = None, base_font_size: int = 16, font_family: str = FONT_FAMILY, raw_gaze: Optional[DataFrame] = None, **figure_overrides) -> go.Figure

Build the canonical scanpath figure for one trial.

words / fixations are normalized frames from load_scanpath_data. participant / trial may be omitted when the frames contain exactly one combo. canvas_size is the monitor size in px; by default it is estimated from the data extents, so pass the real monitor resolution (e.g. (2560, 1440) for OneStop) to keep coordinates true to scale. raw_gaze is a normalized frame (see data.normalize_raw_gaze) and is filtered to the selected trial. Remaining keywords override the app's defaults and are forwarded to plots.make_scanpath_figure (e.g. show_heatmap=False, color_by="pass_index", x_field="order_in_trial").
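Keeping the overrides in a dictionary makes a figure style reusable across trials. The sketch below applies one to the first listed trial of the bundled demo; the keyword names and values are the ones quoted in this section, while `PUBLICATION_STYLE`, the helper name, and the `participant_id` / `trial_id` column names on the `list_trials` result are assumptions.

```python
# Keyword overrides quoted in this section, collected in one place.
PUBLICATION_STYLE = {
    "show_heatmap": True,
    "heatmap_norm": "Log",             # compress heavy-tailed dwell times
    "saccade_color_mode": "By type",   # colour saccades by reading type
    "fixation_opacity": 0.5,
}


def plot_first_sample_trial(style=PUBLICATION_STYLE):
    """Plot the first listed trial of the bundled demo on a true-to-scale canvas."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps

    words, fixations = sps.load_sample_data()
    trials = sps.list_trials(words, fixations)
    # Assumption: the result has participant_id / trial_id columns.
    first = next(iter(trials.itertuples(index=False)))
    return sps.plot_scanpath(
        words, fixations, str(first.participant_id), str(first.trial_id),
        canvas_size=(2560, 1440),  # OneStop monitor resolution, per the text above
        **style,
    )
```

Passing a different dictionary, e.g. `plot_first_sample_trial({"show_heatmap": False})`, swaps the style without touching the trial selection.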

scanpath_studio.api.animate_scanpath

animate_scanpath(words: DataFrame, fixations: DataFrame, participant: Optional[str] = None, trial: Optional[str] = None, *, canvas_size: Optional[Tuple[int, int]] = None, base_font_size: int = 16, font_family: str = FONT_FAMILY, playback_speed: float = 1.0, autoplay: bool = True, **animation_overrides) -> go.Figure

Build the animated scanpath replay for one trial.

Same trial selection and canvas semantics as plot_scanpath. The returned Plotly figure plays in real reading time scaled by playback_speed; save it as interactive HTML with save_figure, or rasterize to GIF/MP4 with animation_export.export_animation.

With autoplay (default True, VIZ-10) the saved interactive HTML auto-starts the replay on load at playback_speed; save_figure honors the marker the builder stamps on the figure. Pass autoplay=False to save a figure that opens paused (press ▶ Play to run it). Autoplay only affects the interactive HTML; GIF/MP4 rasterization renders every frame regardless.

The animation builder accepts a subset of the static figure's options (show_words, show_word_labels, show_saccades, show_order, styling, and second-scanpath overlays) — an unsupported key raises a ValueError naming the valid ones rather than an opaque TypeError.
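When a style dictionary written for the static figure is reused for the replay, the ValueError can be caught and the replay rebuilt with defaults. This is one possible pattern, not a package feature; `animate_or_report` is an illustrative name, and the sketch assumes the ValueError stems from an unsupported key (any other cause would simply raise again on the retry).

```python
def animate_or_report(words, fixations, participant, trial,
                      playback_speed=1.0, autoplay=True, **overrides):
    """Build the replay; if an override is rejected, report it and use defaults."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps

    try:
        return sps.animate_scanpath(
            words, fixations, participant, trial,
            playback_speed=playback_speed, autoplay=autoplay, **overrides,
        )
    except ValueError as exc:
        # The error message names the valid keys, so surface it to the user.
        print(f"Ignoring animation overrides {sorted(overrides)}: {exc}")
        return sps.animate_scanpath(
            words, fixations, participant, trial,
            playback_speed=playback_speed, autoplay=autoplay,
        )
```

Dropping all overrides on failure is deliberately blunt; a stricter caller may prefer to let the ValueError propagate and fix the offending key.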

Saving

scanpath_studio.api.save_figure

save_figure(fig: Figure, path: Union[str, Path], *, scale: int = 2, width: Optional[int] = None, height: Optional[int] = None) -> Path

Save a figure by extension: .html (interactive, browser-free) or .png/.svg/.pdf (static via Kaleido, which needs a Chrome/Chromium install; run plotly_get_chrome -y once if it is missing). width / height set the raster output size in px (overriding the figure's intrinsic layout size); both are ignored for .html. Returns the written path.
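Because the format is chosen by extension, writing the same figure in several formats is a short loop. A minimal sketch, with `save_in_formats` as an illustrative helper name; the static formats still need Kaleido and Chrome as described above.

```python
from pathlib import Path


def save_in_formats(fig, stem, formats=("html", "svg", "png"), **size):
    """Write fig once per extension and return the paths that were written."""
    # Imported inside the function so defining the helper does not need the package.
    import scanpath_studio as sps

    stem = Path(stem)
    stem.parent.mkdir(parents=True, exist_ok=True)
    # Append the extension instead of using with_suffix, so dots in the stem survive.
    return [
        sps.save_figure(fig, stem.parent / f"{stem.name}.{ext}", **size)
        for ext in formats
    ]
```

For example, `save_in_formats(fig, "out/scanpath", width=1200, height=800)` forwards the size keywords to every save_figure call; per the description above they are ignored for the .html file.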

To keep the layers separable for publication editing, save_figure_layers writes one file per layer (word boxes / fixations / saccades / heatmap / labels / stimulus image) — each the full figure with only that layer and a transparent background, at the same size and axis ranges, so they register when stacked in Illustrator / Inkscape:

fig = sps.plot_scanpath(words, fixations, "p1", "t3", show_heatmap=True)
paths = sps.save_figure_layers(fig, "fig_layers", fmt="svg")   # {layer: Path}

scanpath_studio.api.save_figure_layers

save_figure_layers(fig: Figure, directory: Union[str, Path], *, fmt: str = 'svg', scale: int = 2, width: Optional[int] = None, height: Optional[int] = None) -> dict

Split a scanpath figure into its layers and save one file per layer (VIZ-5).

Writes <directory>/<layer>.<fmt> for each visible layer (word boxes / fixations / saccades / heatmap / labels / stimulus image / frame) and returns {layer: Path}. Each layer is the full figure with only that layer's elements and a transparent background, at the same size and axis ranges, so the files register perfectly when stacked in Illustrator / Inkscape. fmt is any save_figure extension without the dot (svg / pdf are vector and best for editing; png / html also work). scale / width / height are forwarded to save_figure.

Saved interactive HTML autoplays on load at the playback speed by default; pass autoplay=False to save a replay that opens paused:

anim = sps.animate_scanpath(words, fixations, "p1", "t3", playback_speed=4.0)
sps.save_figure(anim, "replay.html")                       # autoplays on load
paused = sps.animate_scanpath(words, fixations, "p1", "t3", autoplay=False)
sps.save_figure(paused, "replay_paused.html")              # opens paused

For rasterized animation (GIF / MP4) use the animation exporter:

from scanpath_studio.animation_export import export_animation

anim = sps.animate_scanpath(words, fixations, "p1", "t3")
export_animation(anim, "replay.mp4")   # or replay.gif — needs Kaleido + Chrome

See Export & troubleshooting for the Chrome / ffmpeg requirements.