Installation

Charton is built with a modular, pay-for-what-you-use architecture. By leveraging Cargo features, you can finely tune your compilation to match your production constraints—whether you are deploying to a resource-constrained WebAssembly environment, building high-throughput multi-threaded backends, or bridging to existing Python data science pipelines.

Adding Charton to Your Project

To get started with the standard, single-threaded configuration, add Charton to your Cargo.toml:

[dependencies]
charton = "0.5"

Tailoring with Cargo Features

For production deployments, we strongly recommend enabling specific features to unlock advanced performance and backend rendering capabilities:

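[dependencies]
# The three entries below are alternatives: keep only one `charton` line,
# because TOML rejects duplicate keys within a table. To enable several
# features at once, list them together, e.g. features = ["parallel", "png"].

# Example 1: Enable multi-threaded data processing for massive Polars DataFrames
charton = { version = "0.5", features = ["parallel"] }

# Example 2: Enable native image and document encoders for direct file exports
charton = { version = "0.5", features = ["png", "pdf"] }

# Example 3: Enable the high-speed interop bridge for Altair/Matplotlib integration
charton = { version = "0.5", features = ["bridge"] }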

Feature Flag Matrix

| Feature Flag | Core Mechanism & Dependencies | Target Use Case |
| --- | --- | --- |
| parallel | Activates Rayon-backed parallel computation for scale arbitration and geometry derivation. | Processing high-density data, such as financial market depth charts or massive scatter plots. |
| png | Pulls in native PNG encoding backends, activating the .save("out.png") API. | Automated server-side report generation and automated dashboard asset caching. |
| pdf | Integrates a vector PDF document renderer, activating the .save("out.pdf") API. | Generating publication-quality vector figures compliant with top-tier scientific journals (e.g., NEJM). |
| bridge | Initiates a high-speed IPC channel to map Charton layers directly onto Python Altair/Matplotlib abstract syntax trees. | Dual-stack data pipelines, gradual migration, or reusing mature, domain-specific Python plotting scripts. |

Quick Start

Charton’s API design mirrors the declarative philosophy of the Grammar of Graphics. To balance rapid prototyping flexibility with production-grade engineering rigor, Charton offers a dual-API paradigm: a fluid, concise chart! macro syntax and a deterministic, explicitly managed Chart::build Builder API.

Swift Prototyping with Macros

For data exploration, standalone scripts, or interactive notebook environments, the chart! macro offers a compact, chainable interface for binding raw vectors and mapping them to encodings.

use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Prepare raw observation vectors (Physical measurements: Height vs. Weight)
    let height = vec![160.0, 165.0, 170.0, 175.0, 180.0];
    let weight = vec![55.0, 62.0, 68.0, 75.0, 82.0];

    // 2. Linear declarative pipeline: bind -> instantiate mark -> map encoding -> save
    chart!(height, weight)?
        .mark_point()?
        .encode((alt::x("height"), alt::y("weight")))?
        .save("out.svg")?;

    Ok(())
}

Production-Grade Builder API

While the macro interface is exceptional for quick iterations, enterprise applications demand explicit control over data structures and memory boundaries. The Chart::build API decouples data layout from visual marks, ensuring absolute type safety and allowing for dynamic dataset mutation.

use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let height = vec![160.0, 165.0, 170.0, 175.0, 180.0];
    let weight = vec![55.0, 62.0, 68.0, 75.0, 82.0];

    // 1. Explicitly manage the lifecycle of your Dataset
    let mut ds = Dataset::new()
        .with_column("height", height)?
        .with_column("weight", weight)?;

    // Note: If you need to append data dynamically within conditional branches or loops,
    // utilize the `add_column` method instead:
    // ds.add_column("age", vec![20, 22, 25, 30, 35])?;

    // 2. Build the chart deterministically via a strongly-typed constructor pipeline
    Chart::build(ds)?
        .mark_point()?
        .encode((alt::x("height"), alt::y("weight")))?
        .save("production_out.svg")?;

    Ok(())
}

High-Performance Polars Integration

Charton provides native, high-efficiency ingestion interfaces for Polars DataFrames. To shield your codebase from Polars’ rapid API evolution, Charton ships with versioned compilation macros to maintain bulletproof backwards compatibility.

use polars::prelude::*;
use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Instantiate a standard Polars DataFrame
    let df = df![
        "height" => vec![160.0, 165.0, 170.0, 175.0, 180.0],
        "weight" => vec![55.0, 62.0, 68.0, 75.0, 82.0]
    ]?;

    // 2. Perform zero-copy / highly efficient conversion into a Charton Dataset 
    // using the optimized version-specific macro
    let ds = load_polars_df!(df)?;

    // 3. Bind to the production Builder API
    Chart::build(ds)?
        .mark_point()?
        .encode((alt::x("height"), alt::y("weight")))?
        .save("polars_chart.svg")?;

    Ok(())
}

⚠️ Polars Version Compatibility Reference:

  • Polars 0.53+: Use the modern standard macro load_polars_df!(df)?.
  • Polars 0.44 - 0.52: Use the legacy support macro load_polars_v44_52!(df)?.
  • Polars < 0.44: Unsupported. Upgrading your upstream Polars dependency is highly recommended.

Interactive Notebooks

In modern data engineering and scientific exploration, immediate feedback loops are essential. Charton natively implements runtime hooks for the Rust Jupyter kernel (evcxr_jupyter). By replacing disk-bound .save() sequences with the specialized .show() terminal API, you can render high-fidelity, inline SVG vector visuals instantly within individual notebook cells.

Prerequisites

Before plotting inside a notebook, make sure the evcxr Jupyter kernel is installed and registered on your system. See the Jupyter/evcxr article.

Notebook Inline Execution Blueprint

Create a fresh cell inside your Jupyter Notebook using the Rust (evcxr) kernel, and input the following configuration:

:dep charton = { version = "0.5" }
:dep polars = { version = "0.53", features = ["lazy"] }

use charton::prelude::*;
use polars::prelude::*;

// 1. Initialize evaluation dataframe
let df = df!["x" => [1, 2, 3], "y" => [10, 20, 30]].unwrap();
let ds = load_polars_df!(df).unwrap();

// 2. Compose the declarative graph layer
let chart = Chart::build(ds).unwrap()
    .mark_point().unwrap()
    .encode((alt::x("x"), alt::y("y"))).unwrap();

// 3. Execute inline visualization: renders high-fidelity vector graphics right into the notebook cell
chart.show().unwrap();

Deep Dive: Seamless Environment Self-Adaptation

The .show() method can seamlessly adapt its output behavior whether it is executed inside a live, interactive Jupyter workspace or run within a traditional console application. This resilience is achieved via Charton’s internal runtime environment probing mechanism.

Take a look at how .show() is engineered under the hood:

pub fn show(&self) -> Result<(), ChartonError> {
    // 1. Core Execution: Serialize the in-memory chart nodes and abstract layers into an SVG string
    let svg_content = self.to_svg()?;

    // 2. Probing Environment: Query the active process context for the EVCXR runtime signature
    if std::env::var("EVCXR_IS_RUNTIME").is_ok() {
        // 3. Protocol Handshake: When explicitly run within a Jupyter container,
        // write the SVG to standard output wrapped in the EVCXR content markers
        println!(
            "EVCXR_BEGIN_CONTENT text/html\n{}\nEVCXR_END_CONTENT",
            svg_content
        );
    }

    // 4. Fallback Boundary Safety: If triggered within a regular CLI, microservice,
    // or CI/CD test harness, this method safely concludes with an Ok(()) without
    // polluting standard error or panicking.
    Ok(())
}

The Charton Mental Model

This chapter introduces the design philosophy of Charton. Understanding the "Mental Model" is more important than memorizing APIs, as it governs how you structure data and compose complex, production-grade visualizations in Rust.

The Declarative Paradigm: "What, Not How"

Charton is built on the Grammar of Graphics. In traditional imperative plotting libraries, you manually calculate coordinates and "draw" shapes on a canvas. In Charton, you provide a Specification that describes the relationship between data and visual properties.

Chart::build(&df)?
    .mark_point()           // Geometric Mark: Point
    .encode((
        x("weight"),        // Position Encoding: Map "weight" to X-axis
        y("horsepower"),    // Position Encoding: Map "horsepower" to Y-axis
        color("origin"),    // Visual Encoding: Map "origin" to Color
    ))?

The Orchestrator: LayeredChart

The LayeredChart is the central orchestrator of the system. It is not just a container; it is a State Machine that manages the visualization lifecycle through three key roles:

  1. Structural Intent: It holds multiple Layer objects. You can overlay a scatter plot, a regression line, and annotation text, ensuring they all exist within a shared logical space.
  2. Stateful Overrides: It stores user-defined overrides for domains, ranges, and layouts. These explicit instructions take precedence over the default Theme.
  3. Physical Awareness: It bridges the gap between abstract math and pixels by managing canvas dimensions and coordinate transformations.

Scale Arbitration & Global Aesthetics

A core challenge in multi-layer charts is visual consistency. Charton solves this through Scale Arbitration:

  • Unified Domains: Charton scans every layer to calculate a global ScaleDomain. If Layer A ranges from $[0, 10]$ and Layer B from $[5, 15]$, the LayeredChart automatically aligns the axis to $[0, 15]$.
  • Aesthetic Consistency: The system maintains a unified visual language. If multiple layers map the "origin" column to Color, Charton ensures they share the exact same palette and legend, preventing conflicting visual cues.

Space and Layout: From Logic to Physical

Charton treats space as a first-class citizen, separating the "logic" of a chart from its "physical" appearance:

  • Coordinate Systems: Charton supports Cartesian and Polar systems. The underlying mark logic remains identical; the coordinate system simply handles the transformation of normalized $[0, 1]$ values into canvas positions.
  • The Layout Engine: Charton uses a greedy stacking algorithm (similar to Flexbox). It automatically calculates the space required for titles, axis labels, and legends, ensuring the "Plot Panel" is perfectly centered and legible.
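To make the coordinate-system idea concrete, here is a minimal sketch of how one normalized pair in $[0, 1]$ could be turned into canvas pixels under either system. The names `Panel`, `Coord`, and `to_canvas` are illustrative assumptions, not Charton's actual types, and the polar convention (angle measured clockwise from 12 o'clock) is likewise assumed.

```rust
use std::f64::consts::TAU;

/// Physical rectangle of the plot panel, in pixels (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct Panel {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Hypothetical coordinate-system selector.
#[derive(Debug, Clone, Copy)]
pub enum Coord {
    Cartesian,
    Polar,
}

/// Map a normalized pair (u, v), each in [0, 1], to canvas pixels.
pub fn to_canvas(coord: Coord, panel: Panel, u: f64, v: f64) -> (f64, f64) {
    match coord {
        // Canvas y grows downward, so v = 0 sits on the bottom edge.
        Coord::Cartesian => (
            panel.x + u * panel.width,
            panel.y + (1.0 - v) * panel.height,
        ),
        // u is the angle (one full turn, clockwise from the top), v the radius.
        Coord::Polar => {
            let cx = panel.x + panel.width / 2.0;
            let cy = panel.y + panel.height / 2.0;
            let r = v * panel.width.min(panel.height) / 2.0;
            let theta = u * TAU;
            (cx + r * theta.sin(), cy - r * theta.cos())
        }
    }
}
```

The mark only ever supplies `(u, v)`; switching from `Coord::Cartesian` to `Coord::Polar` changes where the point lands without touching the mark logic.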

Performance & Rust's Ownership Model

Charton is engineered for the Rust ecosystem, leveraging its unique strengths:

  • Zero-Copy with Polars: By utilizing the Apache Arrow format, Charton processes massive DataFrames with minimal memory overhead.
  • Thread-Safe Resolution: Resolved scales are stored in Arc<RwLock<...>>. This allows the rendering backend to safely access scale metadata across multiple threads, enabling high-performance parallel rendering.
  • The Version Bridge: To solve "dependency hell," the bridge module allows passing data as Parquet-serialized bytes. This ensures Charton works seamlessly even if your project uses a version of Polars different from the one Charton was compiled with.
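The thread-safe resolution point can be illustrated with the standard library alone. The sketch below assumes a hypothetical `ResolvedScale` type and a helper `normalize_parallel`; it only shows the general `Arc<RwLock<...>>` read pattern, not Charton's internals.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical resolved scale: a continuous domain mapped onto [0, 1].
#[derive(Debug, Clone, Copy)]
pub struct ResolvedScale {
    pub min: f64,
    pub max: f64,
}

impl ResolvedScale {
    pub fn normalize(&self, value: f64) -> f64 {
        (value - self.min) / (self.max - self.min)
    }
}

/// Normalize each chunk on its own thread while sharing a single scale.
pub fn normalize_parallel(
    scale: Arc<RwLock<ResolvedScale>>,
    chunks: Vec<Vec<f64>>,
) -> Vec<Vec<f64>> {
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let scale = Arc::clone(&scale);
            thread::spawn(move || {
                // Many readers may hold the lock at once; nothing writes
                // to the scale after it has been resolved.
                let guard = scale.read().expect("scale lock poisoned");
                let out: Vec<f64> = chunk.iter().map(|v| guard.normalize(*v)).collect();
                out
            })
        })
        .collect();

    // Joining in spawn order keeps the output aligned with the input chunks.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Because every worker takes a read lock only, the threads do not block one another; a write lock would be needed only while the scale is still being resolved.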

Summary: The Charton Workflow

  1. Data: Prepare your Polars DataFrame.
  2. Specification: Compose Layers using Marks and Encodings.
  3. Arbitration: LayeredChart resolves global Scales, Guides, and Layouts.
  4. Rendering: The RenderBackend (SVG/PNG/PDF) translates geometry into the final output.

System Architecture

Charton’s architecture is designed around the principle of Separation of Concerns. To transform a high-level user specification into a physical image, the system utilizes a four-layer pipeline: Input, Core, Render, and Output.

The Four-Layer Architecture

The following diagram illustrates the flow of information through the system:

I. The Input Layer (Data Ingestion)

The Input layer acts as the entry point for all data. Charton is built on the Arrow memory format, allowing for zero-copy integration with high-performance data libraries.

  • Data Sources: Accepts structured dataframes or serialized streams.
  • Bridge System: Provides a language-agnostic interface that allows the core engine to receive data from different environments (e.g., Python) without version conflicts.

II. The Core Layer (The Specification Engine)

This is the "Brain" of Charton. It is responsible for the logical interpretation of the user’s intent. It consists of three primary sub-systems:

  • Specification (Spec): Stores the "Blueprints" of the chart—which columns go to which axes, which colors are used, and which geometric marks are applied.
  • Scale Arbitration: A critical phase where the engine scans all layers to find global data boundaries (min/max or unique categories) to ensure all layers are visually synchronized.
  • Aesthetic Mapping: Resolves abstract data values into normalized [0, 1] ratios, which are later mapped to physical properties like hex codes or point shapes.
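As a rough illustration of the aesthetic-mapping step, the following sketch normalizes a continuous value and a category into $[0, 1]$. Both function names are assumptions made for this example, and the handling of degenerate domains (returning 0.5) is an assumed convention rather than documented behavior.

```rust
/// Normalize a continuous value against a resolved [min, max] domain.
pub fn normalize_continuous(value: f64, min: f64, max: f64) -> f64 {
    if max == min {
        // Degenerate domain: center the single value (assumed convention).
        return 0.5;
    }
    ((value - min) / (max - min)).clamp(0.0, 1.0)
}

/// Normalize a category by its position in the global category list.
/// Returns `None` when the category is not part of the domain.
pub fn normalize_discrete(value: &str, categories: &[&str]) -> Option<f64> {
    let index = categories.iter().position(|c| *c == value)?;
    if categories.len() == 1 {
        return Some(0.5);
    }
    Some(index as f64 / (categories.len() - 1) as f64)
}
```

With the domain $[0, 15]$, the value 7.5 normalizes to 0.5; with categories `["A", "B", "C"]`, the category `"B"` also lands at 0.5.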

III. The Render Layer (The Geometric Factory)

Once the Core layer has resolved the mathematical logic, the Render layer converts these abstractions into geometry.

  • Coordinate Transformation: Translates normalized data into physical canvas coordinates based on the selected system (e.g., Cartesian, Polar, or Geographic).
  • Layout Engine: A "Flex-box" style system that greedily calculates space for marginalia—axes, titles, and legends—ensuring the plot panel occupies the remaining space correctly.
  • Mark Generation: Generates specific instructions for paths, circles, and rectangles.

IV. The Output Layer (The Backend)

The final layer translates geometric instructions into a specific file format or display buffer.

  • Vector Output: Generates SVG or PDF files for infinite scalability and web integration.
  • Raster Output: Renders high-performance PNG or JPEG images for reports and dashboards.
  • Specification Output: Can export the entire chart state as a JSON specification (compatible with Vega-Lite) for use in frontend applications.

The Visualization Lifecycle

Understanding the architecture requires looking at the Lifecycle of a single chart:

  1. Definition Phase: The user defines the Mark and Encoding.
  2. Training Phase: The system "trains" its scales by looking at the data limits of every layer.
  3. Resolution Phase: Scales are "frozen," and layout constraints (like the width of the Y-axis labels) are calculated.
  4. Assembly Phase: The layout engine allocates space, and the coordinate system maps the data to the final "Rect" of the plot.
  5. Drawing Phase: The backend executes the final draw calls.

Key Takeaway

By decoupling the Core Logic (what the data means) from the Render Logic (where the pixels go), Charton allows users to swap coordinate systems or output formats without ever changing their data analysis code.

From Data to Pixels (The Chart Life Cycle)

A chart in Charton is a dynamic sequence of transformations. This chapter traces the "biography" of a data point—from its raw state in a Polars DataFrame to its final geometric representation on a canvas.

Phase 1: Specification (The "Lazy" Definition)

When you call Chart::build() and chain methods like .mark_point() or .encode(), Charton does not perform any calculations. Instead, it populates a ChartSpec.

  • Intent Gathering: The system records which columns are mapped to which channels (X, Y, Color, etc.).
  • Lazy Evaluation: Data remains in its original DataFrame. This allows you to define complex multi-layer charts without triggering expensive computations prematurely.

Phase 2: Training (The Arbitration)

As defined in the Layer trait, the system must perform a "Training" phase before rendering.

  • Data Extraction: The LayeredChart triggers get_data_bounds() for every layer.
  • Columnar Efficiency: Thanks to the Apache Arrow integration, Charton accesses contiguous memory slices (&[T]) for specific columns. This "Columnar" approach minimizes CPU cache misses.
  • Domain Resolution: The orchestrator merges these bounds into a global ScaleDomain. This is where the "mathematical truth" of the chart is established.
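The data-extraction step boils down to one scan over a contiguous column slice. The function below is a hypothetical stand-in for what a layer's `get_data_bounds()` might compute for a single `f64` column; skipping NaN values is an assumption of this sketch.

```rust
/// Min/max of one column, skipping NaN values.
/// Returns `None` when the column holds no comparable values.
pub fn data_bounds(column: &[f64]) -> Option<(f64, f64)> {
    column
        .iter()
        .copied()
        .filter(|v| !v.is_nan())
        .fold(None, |acc: Option<(f64, f64)>, v| match acc {
            None => Some((v, v)),
            Some((lo, hi)) => Some((lo.min(v), hi.max(v))),
        })
}
```

The scan touches each element exactly once and in memory order, which is the access pattern the columnar layout is meant to favor.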

Phase 3: Layout Negotiation

Before a single pixel is drawn, Charton must solve a spatial puzzle: How much space is left for the data after placing the labels? As seen in layout.rs, Charton uses a Greedy Stacking Algorithm:

  • First Pass (Measurement): The engine estimates the width/height of axis titles and tick labels based on font metrics.
  • Constraint Calculation: It subtracts these dimensions from the total canvas size to determine the PanelContext—the exact "Physical Rectangle" where the data marks will live.
  • Legend Placement: Legends are "stacked" (either vertically or horizontally) using a Flex-box style logic, further refining the available plotting area.
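The greedy stacking idea can be sketched as repeatedly carving measured elements off the edges of the remaining area. `Rect`, `Side`, and `plot_panel` are hypothetical names for this illustration; the real layout.rs is not shown in this chapter and may differ.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

#[derive(Debug, Clone, Copy)]
pub enum Side {
    Top,
    Bottom,
    Left,
    Right,
}

/// Greedily carve each measured element off one side of the remaining area.
/// What is left at the end is the rectangle available to the data marks.
pub fn plot_panel(canvas: Rect, reservations: &[(Side, f64)]) -> Rect {
    let mut panel = canvas;
    for &(side, size) in reservations {
        match side {
            Side::Top => {
                let s = size.min(panel.height);
                panel.y += s;
                panel.height -= s;
            }
            Side::Bottom => panel.height -= size.min(panel.height),
            Side::Left => {
                let s = size.min(panel.width);
                panel.x += s;
                panel.width -= s;
            }
            Side::Right => panel.width -= size.min(panel.width),
        }
    }
    panel
}
```

On a 600 by 400 canvas, reserving 30 px on top for a title, 40 px at the bottom for the X axis, 50 px on the left for the Y axis, and 100 px on the right for a legend leaves a panel at (50, 30) measuring 450 by 330.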

Phase 4: Realization (Mapping to Geometry)

Now the abstract data meets physical space. The Mapper system takes over:

  • Coordinate Translation: The CoordSystem transforms normalized data values into physical (x, y) coordinates within the Plot Panel.
  • Visual Mapping: The VisualMapper converts normalized ratios into concrete visual properties:
    • 0.5 $\rightarrow$ #ff0000 (Color)
    • 0.8 $\rightarrow$ PointShape::Diamond (Shape)
    • 0.2 $\rightarrow$ 4.0px (Radius)
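A visual mapper of this kind can be approximated with plain interpolation. The two functions below are illustrative assumptions (a two-stop linear color ramp and a linear radius range), not the actual VisualMapper, which may use richer palettes.

```rust
/// Linearly interpolate between two RGB colors and format the result as hex.
pub fn ratio_to_hex(ratio: f64, start: [u8; 3], end: [u8; 3]) -> String {
    let t = ratio.clamp(0.0, 1.0);
    let mix = |a: u8, b: u8| (a as f64 + (b as f64 - a as f64) * t).round() as u8;
    format!(
        "#{:02x}{:02x}{:02x}",
        mix(start[0], end[0]),
        mix(start[1], end[1]),
        mix(start[2], end[2])
    )
}

/// Map a normalized ratio onto a radius range given in pixels.
pub fn ratio_to_radius(ratio: f64, min_px: f64, max_px: f64) -> f64 {
    min_px + (max_px - min_px) * ratio.clamp(0.0, 1.0)
}
```

With a blue-to-red ramp, a ratio of 1.0 yields `#ff0000`, and with a radius range of 2 to 12 px, a ratio of 0.2 yields 4.0 px.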

Phase 5: Rendering (The Final Output)

The final stage involves the RenderBackend. Charton iterates through the resolved geometric primitives (Circles, Paths, Rects) and translates them into the target format:

  • SVG/PDF: Generates vector instructions for high-fidelity documents.
  • PNG/Raster: Uses hardware-accelerated drawing for high-performance previews.
  • HTML/Canvas: Renders interactive frames for web environments.

Key Takeaway: The "Single-Pass" Advantage

Because Charton resolves all scales and layouts before the drawing phase, the actual rendering is a "Single-Pass" operation. This predictable flow is what enables Charton to handle millions of points with near-zero latency, as the expensive logical "negotiations" are handled once per frame.

Scale Arbitration

In a single-layer chart, mapping data to scales is straightforward. However, Charton’s power lies in its ability to compose multiple independent layers into a unified visualization. This requires Scale Arbitration—a mechanism that synchronizes disparate data domains into a single, consistent visual language.

The Problem: Domain Mismatch

Imagine you have two layers in one chart:

  1. Layer A (Scatter): Data ranges from $10$ to $50$.
  2. Layer B (Trendline): Data ranges from $5$ to $60$.

If each layer calculated its own scale, Layer A’s $50$ would sit at the very top of the chart, while Layer B’s $50$ would land lower, at roughly $(50 - 5) / (60 - 5) \approx 0.82$ of the axis height. The same value would be drawn at two different positions. Scale Arbitration prevents this visual mismatch by ensuring every layer agrees on the same mathematical boundaries.

The Arbitration Lifecycle

As discussed in the Chart Lifecycle (Chapter 3), arbitration happens during the Training Phase. It follows a three-step process:

Collection (Local Bounds)

The LayeredChart orchestrator calls get_data_bounds() on every individual layer. Each layer reports its "Local Domain"—the min/max for continuous data or a set of unique categories for discrete data.

Merging (Global Union)

The system performs a Union of all local domains:

  • Continuous Scales: It finds the "Global Min" and "Global Max" across all layers.
  • Discrete Scales: It creates a deduplicated set of all categories (e.g., if Layer A has ["A", "B"] and Layer B has ["B", "C"], the global domain becomes ["A", "B", "C"]).
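The merge step is a plain union. A minimal sketch, with hypothetical function names, for both kinds of domain (first-seen ordering for categories is an assumption of this example):

```rust
/// Union of continuous local domains: the global min and global max.
pub fn merge_continuous(local: &[(f64, f64)]) -> Option<(f64, f64)> {
    local
        .iter()
        .copied()
        .reduce(|(lo, hi), (l, h)| (lo.min(l), hi.max(h)))
}

/// Union of discrete local domains, deduplicated in first-seen order.
pub fn merge_discrete(local: &[Vec<&str>]) -> Vec<String> {
    let mut global: Vec<String> = Vec::new();
    for domain in local {
        for category in domain {
            if !global.iter().any(|seen| seen.as_str() == *category) {
                global.push((*category).to_string());
            }
        }
    }
    global
}
```

For the two layers from the example above, `merge_continuous(&[(10.0, 50.0), (5.0, 60.0)])` gives `Some((5.0, 60.0))`, and merging `["A", "B"]` with `["B", "C"]` gives `["A", "B", "C"]`.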

Expansion & Padding

Once the global raw domain is found, Charton applies Expansion Rules. By default, it adds a $5\%$ padding to continuous scales to prevent data marks from clipping against the chart edges.
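As a worked illustration, the sketch below assumes the padding is a fraction of the domain span applied to each side; the text does not spell out whether the default is per side or in total, so treat that as an assumption. `expand_domain` is a hypothetical helper.

```rust
/// Widen a continuous domain by `fraction` of its span on each side.
pub fn expand_domain(min: f64, max: f64, fraction: f64) -> (f64, f64) {
    let pad = (max - min) * fraction;
    (min - pad, max + pad)
}
```

Under that assumption, the merged domain $[5, 60]$ has a span of 55, so the padding is $55 \times 0.05 = 2.75$ and the expanded domain is $[2.25, 62.75]$.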

Shared Aesthetics (Color, Shape, Size)

Arbitration isn't just for X and Y axes; it applies to all visual channels.

  • Color Synchronization: If Layer A maps "Category" to color and Layer B maps "Category" to shape, the arbitrator ensures they both use the same categorical order.
  • Legend Merging: When multiple layers use the same data field for different aesthetics (e.g., both Color and Size represent "Sales"), the arbitrator merges them into a single, coherent legend block.

Conflict Resolution & Overrides

What happens if Layer A wants a Linear scale but Layer B wants a Log scale?

  1. Type Priority: Charton follows a strict hierarchy. If any layer explicitly requests a specific scale type, that type is prioritized. If types are fundamentally incompatible, the system returns a ChartonError.
  2. Manual Overrides: You can "break" the automatic arbitration by providing an explicit domain in the LayeredChart configuration.
     chart.with_x_domain(0.0, 100.0); // This domain will be forced, regardless of layer data.

Why it Matters: The "Single Source of Truth"

By centralizing scale logic in the LayeredChart, Charton ensures:

  • Mathematical Integrity: $X=10$ always means the same physical pixel for every layer.
  • Parallel Safety: Because the scales are resolved and "frozen" before rendering starts, multiple threads can safely read the mapping logic without worrying about state changes.

Summary

Scale Arbitration is the "invisible hand" that negotiates between independent layers to produce a single, accurate coordinate system. It transforms a collection of data fragments into a unified visual story.

Rendering Primitives & Backend Contract

This chapter defines the unified, internal rendering contract that powers all Charton backends (SVG, PNG/Raster, PDF, WGPU). These primitives are not exposed to end users—they form the low-level geometry layer that translates declarative marks (Point, Line, Boxplot, Text, Area) into pixels or vectors.

The design follows strict principles of semantic clarity, cross-backend consistency, performance optimization, and implementation simplicity. Every method has a single, well-defined responsibility.

Core Design Principles

Charton’s RenderBackend is engineered around five non-negotiable constraints:

  1. Backend Agnosticism: A single implementation of marks (Point, Line, Boxplot, Area) must work identically across vector (SVG, PDF) and raster (PNG, WGPU) targets without modification. The same input produces identical visual output regardless of backend.

  2. Semantic Separation & Data-Driven Routing: Each drawing method corresponds to a distinct geometric category. We do not force primitives to guess intents. Complex geometries are routed automatically via lightweight topology hints (e.g., PathTopology), ensuring the backend always selects the optimal hardware pathway.

  3. Cross-Backend Semantic Consistency: All backends must implement exactly the same set of primitives with identical semantics. What draw_polygon means in SVG is exactly what it means in PNG and WGPU. This is the foundation of Charton's reliability.

  4. Performance by Design (The 4-Tier Architecture): Primitives are optimized for their intended use cases: instanced SDF for circles on GPU, line extrusion for fast strokes, hardware Triangle Fans for convex polygons, and heavy Stencil-Then-Cover (STC) pipelines solely reserved for highly complex maps.

  5. Deferred Typography: Font rendering and glyph layout are highly platform-dependent. The backend contract treats text as a "Deferred Ledger," collecting text commands during the geometric pass and executing them natively via the host environment (HTML5 Canvas 2D or CPU-composited Skia) to guarantee pixel-perfect legibility and zero GPU-atlas overhead.

The RenderBackend Primitives: Semantics & Usage

Below is the formal definition of each primitive, its semantic role, performance characteristics, and intended use cases across all backends.

  1. draw_circle(&mut self, config: CircleConfig)
  • Semantics: Renders a perfect circle defined by a center point and radius.

  • Implementation:

    • Vector: Native primitive.
    • CPU Raster: Path-built vector circles rendered via anti-aliased scan-conversion.
    • GPU Raster (WGPU): Instanced SDF (Signed Distance Field) shader evaluated on a single quad.
  • Performance: Extremely fast. Constant-time shader math on GPU. Ideal for millions of scatter plot markers.

  • Use Cases: Scatter points, outlier dots in boxplots, radar chart vertices.

  2. draw_rect(&mut self, config: RectConfig)
  • Semantics: Draws an axis-aligned rectangle from position, width, and height, with optional rounded corners.
  • Implementation:
    • Vector: Native <rect> primitive with rx/ry attributes.
    • CPU Raster: Optimized direct pixel bounding-box fill.
    • GPU Raster (WGPU): Instanced quad with fragment shader clipping for rounded corners.
  • Performance: Near-optimal. Simple bounds check, zero complex geometry generation. Supports millions of instances.
  • Use Cases: Bar marks, boxplot bodies, heatmap cells, UI backgrounds.
  3. draw_line(&mut self, config: LineConfig)
  • Semantics: Draws a single, straight line segment between two explicit points.
  • Implementation:
    • Vector: Native <line> primitive.
    • GPU Raster (WGPU): Expanded directly to an instanced quad in the WGSL vertex shader.
  • Performance: Minimal overhead. Bypasses traditional line width limitations.
  • Use Cases: Boxplot whiskers, axis ticks, grid strokes, error bars.
  4. draw_gradient_rect(&mut self, config: GradientRectConfig)
  • Semantics: Fills an axis-aligned rectangle with a seamless linear gradient.
  • Implementation:
    • Vector: Context-linked <linearGradient>.
    • GPU Raster (WGPU): Custom fragment shader computing interpolations directly on instanced quads.
  5. draw_polygon(&mut self, config: PolygonConfig)
  • Semantics: Renders a closed, strictly convex polygon used for high-performance internal area filling.
  • Implementation:
    • Vector: Closed <polygon> element.
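    • GPU Raster (WGPU): Fast-path triangle-fan decomposition rasterized in a single pass, with zero tessellation or stencil overhead. WGPU exposes no native triangle-fan topology, so the fan is submitted through the TriangleList/TriangleStrip topologies.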
  • Performance: Extremely fast, but relies on the CPU/Mark layer to guarantee the input is convex. Not intended for arbitrary or concave geo-polygons.
  • Use Cases:
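    • Symmetric markers (Triangle, Diamond, Hexagon). A Star is concave, so it only qualifies if it is fanned from its center or split into convex pieces first.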
    • Area Plots: The continuous concave area is sliced by the CPU into hundreds of perfect, convex trapezoids ($X_n \le X_{n+1}$) and fed sequentially to this fast-path filler.
  6. draw_path(&mut self, config: PathConfig)
  • Semantics: The universal topological router. Renders a continuous polyline or a complex area boundary. Its behavior is strictly dictated by the topology hint.
  • Implementation:
    • Vector: Continuous <path> or <polyline> element.
    • GPU Raster (WGPU): Acts as a dispatcher:
      • PathTopology::Simple -> draw_path_simple: Pure GPU normal extrusion. Strokes only, no fills. Used for smoothing chart boundaries and polylines.
      • PathTopology::Complex -> draw_path_complex: The heavy-duty Stencil-Then-Cover (STC) pipeline. Handles concave, self-intersecting, or holed polygons seamlessly using Odd-Even winding hardware passes.
  • Performance: The dispatch itself adds negligible cost. Simple paths bypass expensive triangulation, while complex paths fall back to the robust stencil hardware logic.
  • Use Cases: Continuous line marks, the top-highlighted edge of an area plot (Simple), geographic maps, and multi-layered complex boundaries (Complex).
  7. draw_text(&mut self, config: TextConfig)
  • Semantics: Registers formatted text with strict layout attributes (text-anchor, dominant-baseline) into a Deferred Text Ledger.
  • Implementation:
    • Vector (SVG/PDF): Native <text> nodes.
    • Desktop/Headless (WGPU + PNG): Text commands are collected and bypassed during the WGPU pass. After reading the WGPU rendered buffer back to the CPU, tiny-skia handles the typography compositing.
    • WASM/Web (WGPU + Canvas2D): The WGPU virtual surface renders the geometry. Text is deferred to an absolute-positioned HTML5 <canvas> overlay, utilizing the browser's highly optimized CanvasRenderingContext2D for pixel-perfect native font rendering.
  • Performance: Eliminates WebAssembly-to-JS font glyph memory roundtrips, avoids GPU atlas bloat, and guarantees pristine native text antialiasing across all operating systems.
  • Use Cases: Data labels, axis labels, titles, and legends.
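The area-plot use case of draw_polygon relies on slicing the filled region into convex pieces on the CPU. The sketch below shows one way this could look; `Trapezoid` and `slice_area` are hypothetical names. It assumes the points are sorted by ascending x and all lie on the same side of the baseline, since a segment that crosses the baseline would produce a self-intersecting quad rather than a convex one.

```rust
/// One convex quad of an area plot: four (x, y) corners in drawing order.
pub type Trapezoid = [(f64, f64); 4];

/// Slice the region between a polyline and a flat baseline into trapezoids,
/// one per pair of neighboring samples.
pub fn slice_area(points: &[(f64, f64)], baseline_y: f64) -> Vec<Trapezoid> {
    points
        .windows(2)
        .map(|pair| {
            let (x0, y0) = pair[0];
            let (x1, y1) = pair[1];
            // Walk the outline: bottom-left, top-left, top-right, bottom-right.
            [(x0, baseline_y), (x0, y0), (x1, y1), (x1, baseline_y)]
        })
        .collect()
}
```

A series of n samples yields n - 1 trapezoids, each of which could be passed to a convex fill such as draw_polygon without any tessellation.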

Mark-to-Primitive Routing Rules

The real power of this design is how high-level declarative marks map to low-level primitives. This routing is deterministic, optimized, and identical across all backends:

| Mark Type | Primary Primitive Mapping | Backend Execution Strategy |
| --- | --- | --- |
| PointMark | draw_circle, draw_rect, draw_polygon | GPU Instanced SDF & Hardware Convex Fan. |
| LineMark | draw_path (Topology::Simple) | GPU On-Chip Vertex Extrusion (Stroke only). |
| AreaMark | draw_polygon (Fill) + draw_path (Simple Stroke) | CPU slices concave area into convex trapezoids for draw_polygon; top boundary routed to draw_path for clean extrusion. |
| BoxplotMark | draw_rect, draw_line, draw_circle | Decomposed cleanly into atomic backend shape calls. |
| BarMark | draw_rect | Highly optimized direct bounding-box rectangle fills. |
| Geographic | draw_path (Topology::Complex) | STC (Stencil-Then-Cover) dual-pass GPU filling for irregular concave shapes with holes. |
| TextMark | draw_text (Deferred Ledger) | Geometry rendered via GPU, typography composited via host native environment (HTML5 Canvas 2D / tiny-skia). |

WGPU Implementation Notes (For Backend Developers)

The WGPU backend (WgpuBackend) strictly adheres to a 4-Tier Geometry Architecture mixed with a Deferred Compositing pipeline:

  1. Tier 1 (SDF Instancing): circle_pipeline, rect_pipeline for mathematically perfect, zero-tessellation primitives.
  2. Tier 2 (Vertex Extrusion): draw_path_simple uses normal-expansion in WGSL to create resolution-independent thick lines.
  3. Tier 3 (Convex Fast-Path): draw_polygon relies on basic TriangleList/TriangleStrip topology for blindingly fast fills, assuming the CPU has provided convex indices.
  4. Tier 4 (STC Heavy Weapon): draw_path_complex utilizes wgpu::StencilState with StencilOperation::Invert (Odd-Even winding) to solve arbitrary geographic concave polygons directly on the GPU.
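The split between Tier 2 and Tier 4 is decided by the topology hint. A minimal sketch of that dispatch follows, using a stand-in backend that merely records which pipeline was chosen. `PathTopology`, `draw_path_simple`, and `draw_path_complex` are named in the text; the `PathConfig` fields and `RecordingBackend` are assumptions for illustration only.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathTopology {
    Simple,
    Complex,
}

/// Hypothetical minimal path description; the real `PathConfig` fields
/// are not shown in the text.
pub struct PathConfig {
    pub points: Vec<(f32, f32)>,
    pub topology: PathTopology,
}

/// Stand-in backend that records which pipeline each path was routed to.
#[derive(Default)]
pub struct RecordingBackend {
    pub log: Vec<&'static str>,
}

impl RecordingBackend {
    /// Route a path by its topology hint instead of inspecting its geometry.
    pub fn draw_path(&mut self, config: &PathConfig) {
        match config.topology {
            PathTopology::Simple => self.draw_path_simple(config),
            PathTopology::Complex => self.draw_path_complex(config),
        }
    }

    fn draw_path_simple(&mut self, _config: &PathConfig) {
        // Tier 2 in the list above: stroke-only vertex extrusion.
        self.log.push("vertex-extrusion");
    }

    fn draw_path_complex(&mut self, _config: &PathConfig) {
        // Tier 4 in the list above: stencil-then-cover fill.
        self.log.push("stencil-then-cover");
    }
}
```

Keeping the decision in a single `match` means the caller states its intent once, and the backend never has to analyze the point list to guess whether a path is concave or holed.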

The Deferred Text Ledger:

To decouple heavy font-shaping from the GPU, render_primitive_only collects all TextConfig calls into a Vec<TextConfig>. This ledger is returned to the host runner (save_wgpu_png or render_to_canvas), which then safely overlays the text using the most optimal native 2D API available to the platform.
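The ledger pattern itself is small enough to sketch with the standard library. In this illustration `GeometryPass`, `finish`, and `composite_text` are hypothetical names, and `TextConfig` carries only an assumed subset of fields; the point is only that text calls are recorded during the geometry pass and replayed by the host afterwards.

```rust
/// Hypothetical subset of the text attributes.
#[derive(Debug, Clone, PartialEq)]
pub struct TextConfig {
    pub x: f32,
    pub y: f32,
    pub content: String,
}

/// Stand-in for the geometry pass: shapes are drawn, text is only recorded.
#[derive(Default)]
pub struct GeometryPass {
    pub shapes_drawn: usize,
    ledger: Vec<TextConfig>,
}

impl GeometryPass {
    pub fn draw_circle(&mut self) {
        self.shapes_drawn += 1;
    }

    /// Text is appended to the ledger instead of being rasterized here.
    pub fn draw_text(&mut self, config: TextConfig) {
        self.ledger.push(config);
    }

    /// End the pass and hand the ledger over to the host runner.
    pub fn finish(self) -> Vec<TextConfig> {
        self.ledger
    }
}

/// Host-side compositing: replay the ledger with whatever 2D text API
/// the platform provides, passed in here as a closure.
pub fn composite_text(ledger: &[TextConfig], mut draw: impl FnMut(&TextConfig)) {
    for entry in ledger {
        draw(entry);
    }
}
```

Because `finish` consumes the pass, the ledger can only be replayed after all geometry has been submitted, which mirrors the ordering described above: geometry first, text on top.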

Hybrid GPU-Accelerated Geometry & Deferred Text Rendering Architecture

This document proposes a hybrid, high-performance rendering architecture for our visualization engine. To achieve ultra-high throughput and sub-millisecond interactivity for large-scale datasets, this architecture explicitly splits the rendering pipeline into two decoupled layers: Pure GPU Instancing for Geometric Primitives (bypassing CPU-bound vector tessellation like lyon) and a Zero-Allocation Deferred Ledger for Typography (leveraging target-native text engines like tiny_skia and Browser Canvas 2D).

By migrating complex geometry directly into specialized GPU instancing pipelines and reserving text layout for optimal target-specific layers, the engine eliminates heavy CPU preprocessing, avoids font-atlas memory bloat, minimizes WASM binary footprints, and delivers pixel-perfect text anti-aliasing.


Technical Blueprint: Mark Mapping to Low-Level Primitives

The core architecture maps high-level declarative graphics marks onto mathematically optimized low-level RenderBackend primitives, executed via decoupled native pathways:

| Chart Element | Low-Level Primitive | Rendering Implementation Blueprint | Core Architectural Advantage |
| --- | --- | --- | --- |
| Scatter Plot | draw_circle, draw_polygon | Pure GPU: Instanced SDF (PointData) & Template Vertex Pipelines | Zero CPU Overhead: Points are batched via Storage Buffers. Vertex expansions and circular SDF boundaries are calculated entirely on-chip, scaling to millions of markers at stable 60 FPS. |
| Line Chart | draw_line (Polyline) | Pure GPU: WGSL Thick Line Shader (Dynamic Extrusion via Vertex ID) | On-Chip Extrusion: Completely eliminates CPU-side polyline stroke/join calculations. The vertex shader dynamically expands segments into quads on the GPU, ensuring buttery-smooth web interactions. |
| Area Plot | draw_path (Monotonic) | Pure GPU: Linear Triangulation CPU Stream / WGSL Instanced Ribbon Strips | Trivial Topology: Statistical area bounds follow strict monotonic X/Y progressions with zero self-intersections. A fast memory pass or GPU strip eliminates complex spatial partitioning. |
| Map / Geo | draw_path (Complex) | Pure GPU: Ahead-Of-Time (AOT) Triangulation via earcutr Buffer Cache | GIS Optimization: Complex geographical boundaries containing multi-layer holes/islands are tessellated exactly once during data ingestion, streaming static index buffers straight to the GPU. |
| Text & Labels | draw_text | Deferred Ledger: Captured via Vec<TextConfig> & Composited at Top-Level | Target-Native White-Labeling: Completely strips typography logic out of WGPU. Text is collected via a zero-overhead CPU ledger, then rendered at the very top layer using target-optimized engines (tiny_skia / Canvas 2D). |

The Typography Dilemma in Pure WGPU

During our core engineering iteration, rendering text natively inside wgpu (via crates like glyphon or manual Font Atlas caching) was rejected due to four fatal architectural drawbacks:

  1. WASM Binary & Memory Bloat: Embedding full font files (.ttf/.otf) or complex text-shaping engines into WebAssembly dramatically inflates the .wasm bundle size, destroying web-native startup times. Furthermore, managing font textures, glyph cache evictions, and UV mappings inside VRAM adds significant runtime complexity.
  2. Subpixel Anti-Aliasing Deficiencies: Custom GPU text shaders often struggle with subpixel rendering, leading to blurry, jagged, or poorly scaled labels on low-DPI monitors compared to mature operating system font renderers.
  3. Ecosystem Isolation: For game engines (like Bevy) or web apps embedding this library, a hardcoded WGPU text pipeline forces users into a closed ecosystem. It prevents charts from automatically inheriting the host application’s global UI font, scaling rules, and localization/accessibility features.
  4. Extreme Implementation Complexity & Diminishing Returns: Engineering a robust, production-grade text layout and rendering system within wgpu presents an exceptionally steep learning curve and massive development overhead. Our foundational iterations yielded only partial success and fell significantly short of our core visual expectations. The staggering effort required to manually handle layout bounds, vertical baseline alignment, and multi-language text shaping creates an unsustainable engineering bottleneck with severe diminishing returns.

The Solution: Hybrid Layered Compositing (Deferred Ledger Mode)

To circumvent these issues, WgpuBackend acts as a pure geometric powerhouse. When a draw_text call is triggered, the backend performs zero GPU allocations and creates no vertices. Instead, it acts as a lightweight "accountant," pushing the raw configurations into a serial ledger: pub collected_texts: Vec<TextConfig>.
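
The ledger behaviour described above can be sketched as follows. This is an illustrative model only: the fields of TextConfig, the LedgerBackend type, and the finish method are assumptions made for the example; only the collected_texts: Vec<TextConfig> field name comes from the text.

```rust
/// Simplified, hypothetical stand-in for Charton's TextConfig.
#[derive(Debug, Clone, PartialEq)]
struct TextConfig {
    text: String,
    x: f32,
    y: f32,
    font_size: f32,
}

/// Hypothetical backend: geometry goes to the GPU, text goes to the ledger.
#[derive(Default)]
struct LedgerBackend {
    pub collected_texts: Vec<TextConfig>,
    gpu_draw_calls: usize,
}

impl LedgerBackend {
    /// Stands in for a real GPU draw; here it only counts the call.
    fn draw_circle(&mut self, _x: f32, _y: f32, _radius: f32) {
        self.gpu_draw_calls += 1;
    }

    /// No GPU work at all: the request is only recorded.
    fn draw_text(&mut self, config: TextConfig) {
        self.collected_texts.push(config);
    }

    /// Hands the ledger to the host runner once geometry has been flushed.
    fn finish(self) -> Vec<TextConfig> {
        self.collected_texts
    }
}

fn main() {
    let mut backend = LedgerBackend::default();
    backend.draw_circle(10.0, 20.0, 3.0);
    backend.draw_text(TextConfig {
        text: "Revenue".to_string(),
        x: 10.0,
        y: 12.0,
        font_size: 11.0,
    });

    // The host runner would now overlay each entry with its native 2D text API.
    for entry in backend.finish() {
        println!("{} at ({}, {}), {} px", entry.text, entry.x, entry.y, entry.font_size);
    }
}
```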

Once the WGPU pipeline finishes flushing the geometric primitives to the target buffer, a top-level orchestrator takes over, processing the text ledger through a Target-Aware Dual Engine Pipeline:

[Declarative Chart Request]
         │
         ├──► Geometry ──► WgpuBackend (Instanced SDF / WGSL Shaders) ──► GPU Target Surface
         │                                                                      │
         └──► Text      ──► Collected Memory Ledger (Vec<TextConfig>)           │  (Compositing)
                                       │                                        ▼
                        ┌──────────────┴──────────────┐                 ┌──────────────┐
                        ▼                             ▼                 │              │
               [Desktop / Headless]             [WASM / Web]            │              │
                        │                             │                 │              │
                  (tiny_skia)                 (Canvas 2D Context)       │              │
                        │                             │                 │              │
                        ▼                             ▼                 ▼              ▼
               Stamp Text on Bitmaps          Ctx.fill_text Overlays ──►[Final Visual Output]

1. WebAssembly Environment (WASM / Browser Target)

  • Execution: wgpu renders the grid lines, ticks, and geometric marks directly onto a WebGL2 or WebGPU <canvas>.
  • Text Layer: The engine acquires the browser's native CanvasRenderingContext2d on a transparent top-layer canvas, or overlays HTML/Canvas text directly. It loops through collected_texts and invokes the native ctx.fill_text(...).
  • Advantage: Bypasses the entire text rendering stack. It leverages the browser’s highly optimized, hardware-accelerated typography engine for free, gaining perfect anti-aliasing, layout shaping, internationalization (i18n), and seamless integration with standard web fonts.

2. Desktop / Headless Environment (Local PNG Export)

  • Execution: wgpu renders the high-density geometric marks into an off-screen texture buffer, which is then copied back to a CPU pixel array.
  • Text Layer: The engine initializes a tiny_skia raster backend over the pixel buffer. It iterates over the collected_texts ledger and stamps the text labels cleanly on top of the WGPU-generated geometric base image.
  • Advantage: Retains full standalone rendering capability for server-side automated reports, command-line interfaces, and unit testing without requiring an active browser or UI windowing subsystem.

Alternative Evaluation: The Case for lyon

(This section is retained as our core justification for choosing custom shaders over CPU vector solvers for geometry processing.)

Where lyon Excels

lyon is an exceptionally robust framework designed primarily for arbitrary, unstructured, and unpredictable 2D vector graphics, such as:

  • SVG Engines / Web Browsers: Processing arbitrary vector source files filled with nested Bézier curves and dynamic stroke-joins where geometry is completely unknown prior to runtime.
  • Vector Design Tools (e.g., Figma clones): Where users freely draw, twist, and overlap vector paths.

Why We Reject Tessellation for Geometry

For a specialized data visualization engine, utilizing a general-purpose vector library introduces severe structural inefficiencies:

  • Main-Thread Bottlenecks: Data visualization deals with highly structured, mathematically uniform data (e.g., uniform scatter markers, layout-confined bar quads). Utilizing a comprehensive vector solver to parse these highly predictable primitives introduces massive, redundant CPU branching, choking single-threaded WebAssembly environments.
  • Embracing Hardware Parallelism: By rejecting CPU-side preprocessing, we transfer tasks (such as thick line expansions and marker geometry creation) directly into parallelized GPU processing cores via custom WGSL shaders. The CPU is liberated, acting purely as a rapid data pipe streaming raw structured buffers straight into VRAM.

Conclusion

By implementing this Hybrid Layered Architecture, our system achieves the best of both worlds: uncompromised, jaw-dropping GPU acceleration for geometric data points, and zero-cost, ultra-crisp, flexible typography via native host engines. The resulting pipeline is remarkably lean, deterministic, and tailored for fluid interactivity under extreme industrial data densities.

The Atomic Unit: ColumnVector

At the heart of Charton's performance lies the ColumnVector. While most visualization libraries treat data as a collection of loose objects or rows, Charton adopts a Columnar Memory Layout. This architecture is inspired by Apache Arrow and Polars, ensuring that data is stored in contiguous memory blocks for CPU cache efficiency and potential SIMD acceleration.

The Anatomy of a Column

A ColumnVector is a specialized enum that encapsulates data types relevant to data science and visualization. Every variant (except for those with an intrinsic null representation) follows a dual structure:

  1. Data Buffer: A Vec<T> containing the raw physical values.
  2. Validity Bitmask: An optional (Option-wrapped) Vec in which each bit represents whether a row is "Valid" (1) or "Null" (0).
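
To make the dual structure concrete, here is a minimal sketch of a nullable f64 column. It is not Charton's implementation: the type name NullableF64, the choice of u8 as the mask word, and the least-significant-bit-first ordering are all assumptions made for illustration.

```rust
/// Illustrative nullable column: a value buffer plus an optional bit-packed
/// validity mask (bit = 1 means valid). All names here are hypothetical.
struct NullableF64 {
    data: Vec<f64>,
    validity: Option<Vec<u8>>,
}

impl NullableF64 {
    fn from_options(values: &[Option<f64>]) -> Self {
        let mut data = Vec::with_capacity(values.len());
        let mut mask = vec![0u8; (values.len() + 7) / 8];
        let mut any_null = false;
        for (i, value) in values.iter().enumerate() {
            match value {
                Some(x) => {
                    data.push(*x);
                    mask[i / 8] |= 1 << (i % 8); // mark row i as valid
                }
                None => {
                    data.push(0.0); // placeholder slot; never read as a value
                    any_null = true;
                }
            }
        }
        // A fully valid column needs no mask at all.
        Self { data, validity: any_null.then_some(mask) }
    }

    fn is_valid(&self, row: usize) -> bool {
        match &self.validity {
            None => true,
            Some(mask) => (mask[row / 8] & (1 << (row % 8))) != 0,
        }
    }
}

fn main() {
    let col = NullableF64::from_options(&[Some(1.5), None, Some(3.0)]);
    for row in 0..col.data.len() {
        println!("row {row}: valid = {}", col.is_valid(row));
    }
}
```

Because validity lives in a separate mask, the placeholder written into a null slot can never be mistaken for a real value.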

The Categorical Advantage

One of the most important types for visualization is Categorical. Instead of storing repetitive strings (like "Group A", "Group A"...), it stores u32 keys pointing to a unique dictionary of values. This is essential for rendering large datasets with repetitive labels while keeping memory usage flat.
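
Dictionary encoding itself is simple to sketch. The function below is a generic illustration of the technique, assuming first-seen order for dictionary entries; dictionary_encode is a hypothetical name, not a Charton API.

```rust
use std::collections::HashMap;

/// Replaces each string with a u32 key into a dictionary of unique values.
/// Keys are assigned in order of first appearance.
fn dictionary_encode(values: &[&str]) -> (Vec<u32>, Vec<String>) {
    let mut dict: Vec<String> = Vec::new();
    let mut index: HashMap<&str, u32> = HashMap::new();
    let mut keys = Vec::with_capacity(values.len());
    for &value in values {
        let key = *index.entry(value).or_insert_with(|| {
            dict.push(value.to_string());
            (dict.len() - 1) as u32
        });
        keys.push(key);
    }
    (keys, dict)
}

fn main() {
    let (keys, dict) = dictionary_encode(&["London", "Paris", "London", "Tokyo"]);
    println!("keys = {:?}, dictionary = {:?}", keys, dict);
}
```

Each repeated label costs only one 4-byte key, while the string itself is stored once.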

Manual Construction

Charton provides high-level constructors to turn various Rust string collections into memory-efficient categorical columns automatically.

1. From Raw Strings (No Nulls)

If your data is complete, you can pass collections of String or &str directly. Charton will handle the deduplication and dictionary encoding.

use charton::prelude::*;

fn main() {
    // Accepts Vec<&str> or Vec<String>
    let cities = vec!["London", "Paris", "London", "Tokyo"];
    let col = ColumnVector::from_str_as_cat(cities);
}

2. From Optional Strings (With Null Support)

For datasets with missing values, use from_str_as_cat_opt. This version automatically builds the internal Validity Bitmask.

use charton::prelude::*;

fn main() {
    // Accepts Vec<Option<&str>> or Vec<Option<String>>
    let status = vec![Some("High"), None, Some("Low"), Some("High")];
    let col = ColumnVector::from_str_as_cat_opt(status);
}

3. Why Use Categorical?

  • Memory Efficiency: 1 million rows of "Male"/"Female" take roughly 4 MB as Categorical (one 4-byte u32 key per row plus a two-entry dictionary), compared to ~20MB+ as raw String.
  • Encoding Ready: The underlying u32 keys are used directly by Charton's color scales and legend generators.

Why This Layout?

  • Polars-Friendly: The variants map 1:1 to Polars DataTypes, allowing for near zero-cost ingestion from Polars DataFrames via the load_polars_df! macro.
  • Wasm-Ready: By preserving narrow types like Int8, Charton minimizes memory footprint in memory-constrained WebAssembly environments.
  • Zero-Abstraction Temporal Data: Time data is stored as raw i64 integers, allowing coordinate arithmetic without the cost of high-level object wrapping.

Full Type Mapping Reference

| Charton Variant | Physical Storage | Polars Equivalent | Best Use Case |
| --- | --- | --- | --- |
| Boolean | bool | Boolean | Binary flags, True/False categories. |
| Int8 / Int16 | i8 / i16 | Int8 / Int16 | Memory-efficient small integers (e.g., months). |
| Int32 / Int64 | i32 / i64 | Int32 / Int64 | General purpose integers or primary IDs. |
| UInt32 | u32 | UInt32 | Array indices or internal dictionary keys. |
| UInt64 | u64 | UInt64 | Large hashes or 64-bit unique identifiers. |
| Float32 | f32 | Float32 | Memory-efficient coordinates for high-density plots. |
| Float64 | f64 | Float64 | The standard for most coordinate and value axes. |
| String | String | String / Utf8 | Unique labels or long descriptions. |
| Categorical | u32 Keys + String Dict | Categorical / Enum | Recommended for Legends, Colors, and repeated labels. |
| Date | i32 (days since epoch) | Date | Calendar-based timelines. |
| Datetime | i64 + TimeUnit | Datetime | Time-series data with sub-second precision. |
| Duration | i64 + TimeUnit | Duration | Time deltas or Gantt chart intervals. |
| Time | i64 (nanos since midnight) | Time | Daily cycles and clock-time analysis. |

The Temporal Engine: High-Fidelity Data Model

In Charton, time is more than just a label—it's a high-performance coordinate system. By adopting a "Data-as-Truth" philosophy, Charton prioritizes the preservation of raw input signals over invasive normalization, ensuring maximum precision from ingestion to rendering.

1. The Physical Blueprint: Hardware-Native Storage

Charton "physicalizes" temporal data into primitive integers. This architecture ensures 100% compatibility with the Polars/Arrow ecosystem and enables SIMD (Single Instruction, Multiple Data) acceleration for coordinate calculations.

| Semantic Type | Physical Storage | Default Unit | Logic & Fidelity |
| --- | --- | --- | --- |
| Datetime | i64 | Nanoseconds | Absolute UTC-based points; captures full OffsetDateTime precision. |
| Date | i32 | Days | Calendar intervals (Epoch Days); optimized for memory (4 bytes/row). |
| Time | i64 | Nanoseconds | Intra-day offset from Midnight; preserves sub-microsecond event ticks. |
| Duration | i64 | Nanoseconds | Relative spans; maintains mathematical symmetry with Datetime. |

2. Core Philosophy: Data as the "Gold Standard"

2.1 Precision-First Ingestion

Instead of forcing data into a lossy floating-point representation or a system-defined reference point during loading, Charton treats the user's original values as immutable:

  • Integer Domain Residency: Data stays in the i64/i32 domain as long as possible. This avoids the Floating-Point Precision Trap, where f64 loses nanosecond-level resolution when representing large Unix timestamps (the "big number noise" problem).
  • Zero-Copy Potential: By matching the memory layout of modern data frames, Charton can map raw buffers directly into ColumnVector variants with zero re-sampling or multiplication overhead.
  • Maximum Fidelity: When converting from high-level objects like time::OffsetDateTime, Charton automatically extracts nanosecond-level integers, ensuring not a single bit of precision is discarded.

2.2 Late-Binding Projection

The conversion to visual coordinates (f64) is deferred until the last possible moment—the Scaling Stage:

  1. Direct Mapping: Scales operate directly on integer slices for range (min/max) detection.
  2. On-Demand Normalization: Conversion to f64 happens only when calculating pixel positions. This "Late-Binding" approach ensures that even at extreme zoom levels, coordinates are derived from the highest-resolution data available.
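
The precision argument can be checked directly. In the sketch below, project is a hypothetical helper (not a Charton function) that subtracts the domain minimum in the integer domain and converts to f64 only for the final ratio. An f64 has a 53-bit significand, so nanosecond Unix timestamps near 1.7e18 are only representable in steps of 256 ns.

```rust
/// Maps an i64 timestamp onto [0, width] pixels. The subtraction is done on
/// integers; f64 appears only once the offset is small.
/// Assumes domain.1 > domain.0 and that the subtraction does not overflow.
fn project(t: i64, domain: (i64, i64), width: f64) -> f64 {
    let (min, max) = domain;
    let offset = (t - min) as f64;
    let span = (max - min) as f64;
    offset * width / span
}

fn main() {
    let t0: i64 = 1_715_760_000_000_000_000; // nanoseconds since the Unix epoch
    let t1 = t0 + 1; // one nanosecond later

    // Early conversion: both timestamps collapse to the same f64 value.
    println!("equal as f64: {}", (t0 as f64) == (t1 as f64));

    // Late binding: the 1 ns difference survives at a 100 ns-wide zoom level.
    let domain = (t0, t0 + 100);
    println!("pixels: {} vs {}", project(t0, domain, 800.0), project(t1, domain, 800.0));
}
```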

3. The Scaling Bridge: Semantic Metadata

Charton uses TimeUnit not as a conversion target, but as Metadata that describes the inherent scale of the underlying integers.

| Unit | Scaling Factor ($val \rightarrow s$) | Practical Application |
| --- | --- | --- |
| Days | 86400.0 | Macro: Geological eras, historical records, and daily business logs. |
| Seconds | 1.0 | Standard: General IoT telemetry and basic event logging. |
| Millis | 1e-3 | Web: Seamless synchronization with JavaScript Date.now(). |
| Nanos | 1e-9 | Micro: High-frequency trading (HFT) and sub-atomic event profiling. |
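
The factors in the table above can be expressed as a small lookup. The enum below mirrors the table rows only; it is a hypothetical sketch, and Charton's actual TimeUnit type may have different variants or method names.

```rust
/// Hypothetical mirror of the units listed in the table.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeUnit {
    Days,
    Seconds,
    Millis,
    Nanos,
}

impl TimeUnit {
    /// Number of seconds represented by one integer step of this unit.
    fn seconds_per_unit(self) -> f64 {
        match self {
            TimeUnit::Days => 86_400.0,
            TimeUnit::Seconds => 1.0,
            TimeUnit::Millis => 1e-3,
            TimeUnit::Nanos => 1e-9,
        }
    }
}

/// Interprets a raw integer using the unit as metadata; the stored integer
/// itself is never rewritten.
fn to_seconds(value: i64, unit: TimeUnit) -> f64 {
    value as f64 * unit.seconds_per_unit()
}

fn main() {
    println!("{}", to_seconds(2, TimeUnit::Days));
    println!("{}", to_seconds(1_500, TimeUnit::Millis));
}
```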

Semantic Intelligence

Retaining the original semantic variant (e.g., Date vs Datetime) allows the engine to make intelligent UI decisions:

  • Adaptive Tick Generation: A Date column automatically aligns its axis ticks to human-friendly day/month boundaries.
  • Unit-Aware Formatting: The system knows a Duration represents a span (e.g., "+30s") while a Time represents a specific clock point (e.g., "14:00:30").

4. Ecosystem Synergy: Rust Data Stack

Charton is designed as the visual extension of the modern Rust data ecosystem.

  • Polars & Arrow: Direct ingestion of primitive buffers, respecting the TimeUnit and TimeZone metadata defined in the schema.
  • Time Crate Integration: Native From implementations for OffsetDateTime, Date, and Time.
  • Memory Efficiency: By using Arc<ColumnVector> within a Dataset, Charton enables zero-copy data sharing across multiple threads, layers, and viewports.

5. Performance Layer

  • SIMD Acceleration: Contiguous memory layout allows the CPU to process temporal filters, range checks, and projections in parallel batches.
  • Validity Bitmasks: Charton uses an independent bitmask to handle Null values. This eliminates the need for "Sentinel Values" (like 0 or -1) which could be confused with actual epoch timestamps.
  • Thread-Safe Concurrency: Arc-wrapped columns allow for simultaneous rendering of different views (e.g., a main chart and an overview minimap) without memory contention.

6. Summary: Fidelity Without Compromise

By moving away from intrusive pre-processing and adopting a High-Fidelity Integer Model, Charton achieves a critical balance: it is robust enough to hold the history of the universe in days, yet sharp enough to distinguish the individual ticks of a nanosecond-level signal—all while maintaining the absolute integrity of the user's original data.

The Dataset: High-Performance Data Container

The Dataset is the primary unit of data movement in Charton. It is a column-oriented container designed for high-performance visualization, thread safety, and zero-copy data sharing.

Internal Architecture

A Dataset manages a collection of ColumnVectors using a schema-based lookup. Its design focuses on three core principles:

  1. Columnar Layout: Data is stored as a Vec<Arc<ColumnVector>>. Using Arc allows multiple parts of a visualization (e.g., different chart layers) to share the same data without duplication.
  2. Schema Integrity: A Dataset ensures all columns have identical row counts (row_count), preventing out-of-bounds errors during rendering.
  3. Fast Lookup: An AHashMap maps column names to their physical index in the column vector for $O(1)$ access.

use std::sync::Arc;
use ahash::AHashMap;

#[derive(Clone, Default)]
pub struct Dataset {
    /// Maps column names to their index in the `columns` vector.
    pub(crate) schema: AHashMap<String, usize>,
    /// Arc-wrapped columns for zero-copy sharing and thread safety.
    pub(crate) columns: Vec<Arc<ColumnVector>>,
    /// Total row count. Must be consistent across all columns.
    pub(crate) row_count: usize,
}

Construction Methods

Charton provides multiple ways to ingest data, catering to different logic flows—from static configurations to dynamic processing.

1. Fluent / Builder Style

Best for static declarations or building datasets without mut variables. Each call to with_column consumes and returns the Dataset.

use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ds = Dataset::new()
        .with_column("x", vec![10.0, 20.0, 30.0])?
        .with_column("y", vec![Some(100i64), None, Some(300i64)])?
        .with_column("category", vec!["A", "B", "C"])?;
    Ok(())
}

2. Imperative Style

Ideal for dynamic logic or loops where you only have a mutable reference (&mut self) to the dataset.

use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut ds = Dataset::new();
    let sepal_length = vec![5.1, 4.9, 4.7, 4.6, 5.0];
    let species = vec![Some("Iris-setosa"), None, None, None, Some("Iris-virginica")];
    ds.add_column("sepal_length", sepal_length)?;
    ds.add_column("species", species)?;
    Ok(())
}

3. Collection Conversion (ToDataset Trait)

The most idiomatic way to perform bulk ingestion from key-value pairs (vectors of tuples).

use charton::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw_data = vec![
        ("mpg", vec![18, 15, 18].into_column()),
        ("car_name", vec!["chevrolet", "buick", "plymouth"].into_column()),
    ];

    let ds = raw_data.to_dataset()?;
    Ok(())
}

Example

To ensure full compatibility with diverse workflows, Dataset can hold numerical, categorical, and temporal types simultaneously. Below is a 5-row example demonstrating every supported category using the time crate.

use charton::prelude::*;
use time::{Date, Duration, Month};
use time::macros::datetime;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let complex_data = vec![
        // 1. Numerical & Boolean
        ("id", vec![1u64, 2, 3, 4, 5].into_column()),
        ("active", vec![true, true, false, true, false].into_column()),
        ("score", vec![Some(95.5), Some(88.0), None, Some(76.2), Some(91.0)].into_column()),
        
        // 2. Categorical (Dictionary Encoded)
        ("group", ColumnVector::from_str_as_cat(
            vec!["High", "Low", "High", "Medium", "Low"]
        )),
        
        // 3. Raw Strings (Unique Labels)
        ("label", vec!["Alpha", "Beta", "Gamma", "Delta", "Epsilon"].into_column()),
        
        // 4. Temporal: Datetime & Date
        // Using time::macros::datetime! is the standard, idiomatic way 
        // to create OffsetDateTime instances in Rust code.
        ("timestamp", vec![
            datetime!(2026-05-01 00:00 UTC),
            datetime!(2026-05-01 12:00 UTC),
            datetime!(2026-05-02 00:00 UTC),
            datetime!(2026-05-02 12:00 UTC),
            datetime!(2026-05-03 00:00 UTC),
        ].into_column()),
        
        // Using Date::from_calendar_date is the standard safe constructor for Dates.
        ("date", vec![
            Date::from_calendar_date(2026, Month::May, 1)?,
            Date::from_calendar_date(2026, Month::May, 2)?,
            Date::from_calendar_date(2026, Month::May, 3)?,
            Date::from_calendar_date(2026, Month::May, 4)?,
            Date::from_calendar_date(2026, Month::May, 5)?,
        ].into_column()),
        
        // 5. Duration (Time Deltas)
        // Using Duration::seconds is the standard constructor.
        ("lead_time", vec![
            Duration::seconds(100),
            Duration::seconds(250),
            Duration::seconds(500),
            Duration::seconds(750),
            Duration::seconds(1000),
        ].into_column()),
    ];

    let ds = complex_data.to_dataset()?;

    println!("{:?}", ds);
    
    Ok(())
}

Note: The above example uses the time crate with its parsing and macros features enabled for temporal types.

Core API Reference

Inspection

  • height() -> usize: Returns the number of rows.
  • width() -> usize: Returns the number of columns.
  • get_column_names() -> Vec<String>: Returns names in their insertion order.
  • is_null(name, row) -> bool: Checks if a specific cell is null (handles both NaN and validity bitmasks).

Data Access

  • column(name) -> Result<&ColumnVector>: Access the column wrapper to inspect metadata (units, validity).
  • get_column<T>(name) -> Result<&[T]>: High-performance access to the underlying physical slice.
    • Note: For temporal types, this returns the raw integer slice (i64 for Datetime, Duration, and Time; i32 for Date, per the type mapping reference).

Slicing (Zero-Copy)

Charton uses "Eager Slicing." Because columns are wrapped in Arc, these operations are extremely lightweight and do not copy the underlying data buffers.

  • head(n): Returns a new Dataset containing the first n rows.
  • tail(n): Returns a new Dataset containing the last n rows.
  • slice(offset, len): Returns a new Dataset starting at offset with len rows.
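
One way such zero-copy slicing can work is a view that shares the buffer through Arc and only adjusts an offset and a length. This is a sketch under that assumption; ColumnView and its fields are hypothetical and do not describe Charton's actual internals.

```rust
use std::sync::Arc;

/// Hypothetical view type: a shared buffer plus a window into it.
#[derive(Clone)]
struct ColumnView {
    buffer: Arc<Vec<f64>>,
    offset: usize,
    len: usize,
}

impl ColumnView {
    fn new(values: Vec<f64>) -> Self {
        let len = values.len();
        Self { buffer: Arc::new(values), offset: 0, len }
    }

    /// Returns a window clamped to the current bounds; no data is copied.
    fn slice(&self, offset: usize, len: usize) -> Self {
        let limit = self.offset + self.len;
        let start = (self.offset + offset).min(limit);
        let end = (start + len).min(limit);
        Self { buffer: Arc::clone(&self.buffer), offset: start, len: end - start }
    }

    fn head(&self, n: usize) -> Self {
        self.slice(0, n)
    }

    fn tail(&self, n: usize) -> Self {
        let n = n.min(self.len);
        self.slice(self.len - n, n)
    }

    fn as_slice(&self) -> &[f64] {
        &self.buffer[self.offset..self.offset + self.len]
    }
}

fn main() {
    let col = ColumnView::new(vec![1.0, 2.0, 3.0, 4.0, 5.0]);
    let last_two = col.tail(2);
    println!("{:?}", last_two.as_slice());
    // Both views point at the same allocation.
    println!("shared: {}", Arc::ptr_eq(&col.buffer, &last_two.buffer));
}
```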

Debugging: The Tabular View

Printing the Dataset via {:?} renders a clean, aligned table with type markers.

Dataset View: rows 0..5 (Total 5 rows)
id   | active| score  | group | label  | timestamp           | date      | lead_time     
(u64)| (bool)| (f64)  | (cat) | (str)  | (datetime[ns])      | (date)    | (duration[ns])
-----------------------------------------------------------------------------------------
1    | true  | 95.5000| High  | Alpha  | 2026-05-01T00:00:00Z| 2026-05-01| 100000000000  
2    | true  | 88.0000| Low   | Beta   | 2026-05-01T12:00:00Z| 2026-05-02| 250000000000  
3    | false | null   | High  | Gamma  | 2026-05-02T00:00:00Z| 2026-05-03| 500000000000  
4    | true  | 76.2000| Medium| Delta  | 2026-05-02T12:00:00Z| 2026-05-04| 750000000000  
5    | false | 91.0000| Low   | Epsilon| 2026-05-03T00:00:00Z| 2026-05-05| 1000000000000 

Data Ingestion & Interop

Charton is positioned as the "Rendering Layer" in the Rust data science ecosystem. To provide the most efficient workflow, Charton offers deep integration with Polars, the de facto standard for high-performance DataFrames in Rust.

Data Pipeline Overview

In a typical Charton application, data flows through the following pipeline:

  1. External Input: Load raw data from CSV, Parquet, JSON, or SQL databases.
  2. Polars Processing: Utilize Polars' Lazy engine for filtering, joins, aggregations, or feature engineering.
  3. Charton Conversion: Convert the processed polars::DataFrame into a charton::Dataset.
  4. Visualization: Pass the Dataset to Charton's rendering engine.

Polars Integration

Charton uses the load_polars_df!() macro to convert a Polars DataFrame into a Charton Dataset.

use charton::prelude::*;
use polars::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create DataFrame with diverse types, including native Polars temporal types
    let df = df!(
        "id" => &[1, 2, 3, 4, 5],
        "status" => &["High", "Low", "High", "Medium", "Low"],
        "value" => &[Some(1.2), None, Some(5.6), Some(7.8), None],
        "date" => Series::new("date".into(), &[19858i32, 19859, 19860, 19861, 19862]).cast(&DataType::Date)?, // ~2024-05-15
        "datetime" => Series::new("datetime".into(), &[1715760000000i64, 1715763600000, 1715767200000, 1715770800000, 1715774400000])
            .cast(&DataType::Datetime(TimeUnit::Milliseconds, None))?,
        "duration" => Series::new("duration".into(), &[3_600_000i64, 7_200_000, 1_800_000, 10_800_000, 5_400_000])
            .cast(&DataType::Duration(TimeUnit::Milliseconds))?,
    )?;

    // Conversion to Charton dataset
    let ds = load_polars_df!(df)?;
    println!("{:?}", ds);

    Ok(())
}

Type Mapping & Metadata Preservation

Charton ensures strict metadata alignment during conversion. The following table illustrates how Polars logical types map to Charton physical storage:

| Polars Logical Type | Charton Physical Type | Notes |
| --- | --- | --- |
| Int8, Int16, Int32, Int64 | i8, i16, i32, i64 | Direct physical mapping. |
| UInt32, UInt64 | u32, u64 | Direct physical mapping. |
| Float32, Float64 | f32, f64 | NaN values are treated as Nulls. |
| Boolean | bool | Mapped to nullable boolean vector. |
| Utf8 / String | String | Stored as nullable string vectors. |
| Categorical(_, _), Enum(_, _) | Categorical | Preserves dictionary encoding + validity. |
| Date | Date | Stored as i32 days since Unix epoch. |
| Time | Time | Stored as i64 nanoseconds since midnight. |
| Datetime(unit, _) | Datetime | Normalized to i64 nanoseconds since Unix epoch. |
| Duration(unit) | Duration | Normalized to i64 nanoseconds. |

Note: Categorical does not appear to be a primitive (physical) type in Rust Polars.

Encodings & Channels

In Charton, Encoding is the bridge between the "Data World" and the "Visual World." It defines how a specific column (Dimension) in your dataset is transformed into a visual property that a human can perceive.

What are Visual Channels?

A visual channel is a physical property of a graphic that can carry information. Based on the implementation in encode.rs, Charton supports the following core channels:

  • Position Channels: X and Y. These are the most powerful channels, used to represent numerical magnitude or categorical ordering.
  • Aesthetic Channels:
    • Color: Used for distinguishing categories (Palettes) or representing gradients (Continuous maps).
    • Shape: Used exclusively for categorical data, using different geometric primitives (Circle, Square, Triangle).
    • Size: Typically mapped to point radii or line stroke widths to represent weight or importance.
  • Support Channels: Text (for labels) and Y2 (used as a baseline for area charts or interval bars).

Mapping Data to Aesthetics

Encoding is a Semantic Declaration. When you write the following code:

chart.encode((
    x("timestamp"),
    y("temperature"),
    color("city"),
))

You are instructing Charton's engine to:

  1. Extract: Retrieve the "timestamp," "temperature," and "city" columns from the dataset.
  2. Assign: Bind these columns to the horizontal position, vertical position, and color hue, respectively.
  3. Infer: The system automatically detects the data types (e.g., Temporal for time, Linear for temperature, Discrete for city) to choose the correct mapping logic.

The Encoding Architecture

Under the hood in encode.rs, all encoding requests are consolidated into a central Encoding container:

pub struct Encoding {
    pub(crate) x: Option<X>,
    pub(crate) y: Option<Y>,
    pub(crate) color: Option<Color>,
    pub(crate) shape: Option<Shape>,
    pub(crate) size: Option<Size>,
    // ... other channels
}

Each channel (e.g., Color) stores the Field name and a ResolvedScale. In the "Arbitration Phase" of the chart lifecycle, the engine injects a concrete mathematical scale into this placeholder based on the global data boundaries.

Global Aesthetic Consistency

A key design goal of Charton is Global Aesthetics. As demonstrated in aesthetics.rs, if multiple layers use the same field for the same channel (e.g., both a Point layer and a Line layer use color("species")), the GlobalAesthetics orchestrator ensures:

  1. Uniformity: The "Species A" category will have the exact same hex color in every layer.
  2. Consolidation: Only one unified Legend is generated for that field, preventing visual clutter and conflicting guides.

Tuple-Based Expressiveness

To streamline the developer experience, Charton utilizes Rust macros to implement IntoEncoding for tuples. This allows you to define multiple channels in a single, concise call, making the code highly readable and expressive.
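
The general pattern of implementing a trait for tuples with a macro looks roughly like this. The sketch is a simplified model: the Channel trait, the string-only X, Y, and Color structs, and the three-field Encoding are assumptions made for the example, not Charton's actual definitions in encode.rs.

```rust
/// Simplified, hypothetical encoding container.
#[derive(Debug, Default, PartialEq)]
struct Encoding {
    x: Option<String>,
    y: Option<String>,
    color: Option<String>,
}

/// Each channel knows how to write itself into the container.
trait Channel {
    fn apply(self, enc: &mut Encoding);
}

struct X(String);
struct Y(String);
struct Color(String);

impl Channel for X {
    fn apply(self, enc: &mut Encoding) { enc.x = Some(self.0); }
}
impl Channel for Y {
    fn apply(self, enc: &mut Encoding) { enc.y = Some(self.0); }
}
impl Channel for Color {
    fn apply(self, enc: &mut Encoding) { enc.color = Some(self.0); }
}

trait IntoEncoding {
    fn into_encoding(self) -> Encoding;
}

/// Generates one impl per tuple arity, so the trait does not have to be
/// written out by hand for every tuple length.
macro_rules! impl_into_encoding {
    ($($name:ident),+) => {
        impl<$($name: Channel),+> IntoEncoding for ($($name,)+) {
            #[allow(non_snake_case)]
            fn into_encoding(self) -> Encoding {
                let ($($name,)+) = self; // destructure the tuple
                let mut enc = Encoding::default();
                $($name.apply(&mut enc);)+
                enc
            }
        }
    };
}

impl_into_encoding!(A);
impl_into_encoding!(A, B);
impl_into_encoding!(A, B, C);

fn main() {
    let enc = (
        X("timestamp".to_string()),
        Y("temperature".to_string()),
        Color("city".to_string()),
    )
        .into_encoding();
    println!("{:?}", enc);
}
```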

Key Takeaways

  • Declarative Intent: You define "what" maps to "where," not "how" to calculate pixels.
  • Channel Hierarchy: Position channels (X/Y) are prioritized for accuracy, while aesthetics (Color/Shape) are used for grouping.
  • Lazy Resolution: Encodings store the "Intent"; the actual pixel values are calculated only during the realization phase.

Scales & Domains

If Encoding defines which data fields are connected to which visual channels, then the Scale defines how that connection is mathematically calculated. A Scale is a function that maps values from a Data Domain to a Visual Range.

Core Concepts: Domain and Range

Every Scale operates between two distinct spaces:

  1. Domain: The space of input data values. For example, a temperature range in your dataset $[0^\circ\text{C}, 100^\circ\text{C}]$ or a set of categories like ["Apple", "Banana", "Orange"].
  2. Range: The space of visual output values. For an axis, this is typically physical pixels (e.g., $[0, 800]$); for a color encoding, it is a sequence of colors in a palette.

Scale Types

Charton provides specialized scale types based on the nature of the underlying data (Quantitative, Categorical, or Temporal):

Linear Scale

The most common scale for quantitative data. It preserves the original proportional relationships between data points using a linear transformation ($y = ax + b$).

  • Best for: Prices, lengths, weights, and counts.
  • Expansion: To prevent data marks (like scatter points) from being cut off at the edges, Charton automatically applies a $5\%$ padding to both ends of the Domain by default.
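
A linear scale with domain padding can be sketched in a few lines. The LinearScale type below is hypothetical, and it assumes the padding is a fraction of the domain span added at each end; Charton's exact expansion rule may differ.

```rust
/// Hypothetical linear scale: y = a * x + b over a padded domain.
struct LinearScale {
    d0: f64,
    d1: f64,
    r0: f64,
    r1: f64,
}

impl LinearScale {
    /// `padding` is a fraction of the domain span added at both ends
    /// (0.05 corresponds to a 5% expansion).
    fn new(domain: (f64, f64), range: (f64, f64), padding: f64) -> Self {
        let pad = (domain.1 - domain.0) * padding;
        Self { d0: domain.0 - pad, d1: domain.1 + pad, r0: range.0, r1: range.1 }
    }

    fn map(&self, x: f64) -> f64 {
        let a = (self.r1 - self.r0) / (self.d1 - self.d0); // slope
        let b = self.r0 - a * self.d0; // intercept
        a * x + b
    }
}

fn main() {
    // Data in [0, 100] drawn on an 880-pixel axis with 5% padding:
    // the padded domain is [-5, 105], so the slope is 880 / 110 = 8.
    let scale = LinearScale::new((0.0, 100.0), (0.0, 880.0), 0.05);
    println!("{} {}", scale.map(0.0), scale.map(100.0));
}
```

With the padding applied, the smallest and largest data values land strictly inside the pixel range, so marks drawn at the extremes are not clipped.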

Logarithmic (Log) Scale

When data spans multiple orders of magnitude (e.g., $1$ to $1,000,000$), a linear scale compresses small values. A Log Scale applies a logarithmic transformation, ensuring that each order of magnitude occupies an equal visual weight.

  • Note: The Domain of a Log Scale cannot contain zero or negative numbers.
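
As an illustration of both the transformation and the positivity constraint, here is a minimal base-10 log scale. The function name log_scale and the choice to return None for invalid input are assumptions for this sketch, not Charton's API.

```rust
/// Maps `x` from a strictly positive domain to the range using log10.
/// Returns None for zero or negative inputs, where the logarithm is undefined,
/// and for a degenerate domain.
fn log_scale(x: f64, domain: (f64, f64), range: (f64, f64)) -> Option<f64> {
    let (d0, d1) = domain;
    if x <= 0.0 || d0 <= 0.0 || d1 <= 0.0 || d0 == d1 {
        return None;
    }
    // Normalized position of x within the domain, measured in decades.
    let t = (x.log10() - d0.log10()) / (d1.log10() - d0.log10());
    Some(range.0 + t * (range.1 - range.0))
}

fn main() {
    // Six decades on a 600-pixel axis: every decade gets about 100 pixels.
    for x in [1.0, 10.0, 1_000.0, 1_000_000.0] {
        println!("{x} -> {:?}", log_scale(x, (1.0, 1_000_000.0), (0.0, 600.0)));
    }
}
```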

Discrete (Ordinal) Scale

Used for categorical or ranked data. Unlike continuous scales, it partitions the visual range into discrete "slots" or "bins."

  • Best for: Country names, product categories, or ratings (e.g., Poor/Fair/Good).
  • Stability: The system uses stable sorting for categories to ensure that the order of items remains consistent across multiple renders.
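
The slot partitioning can be sketched as below. The helper band_center is hypothetical; it assumes equal-width slots and places each category at the center of its slot, in the order the categories are supplied.

```rust
/// Returns the center of the slot assigned to `category`, or None if the
/// category is not part of the domain.
fn band_center(categories: &[&str], category: &str, range: (f64, f64)) -> Option<f64> {
    let index = categories.iter().position(|c| *c == category)?;
    let step = (range.1 - range.0) / categories.len() as f64; // slot width
    Some(range.0 + step * (index as f64 + 0.5))
}

fn main() {
    let ratings = ["Poor", "Fair", "Good"];
    for rating in ratings {
        println!("{rating} -> {:?}", band_center(&ratings, rating, (0.0, 300.0)));
    }
}
```

Because the slot index depends only on the category order, the same input order yields the same positions on every render.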

Temporal Scale

Specifically designed for dates and times. It understands the irregular spans of days, hours, and minutes, and automatically generates human-readable axis ticks (e.g., showing months instead of raw timestamps).

Ticks and Label Generation

Scales do more than just map positions; they communicate meaning through Guides.

  • Tick Generation: The system calculates "pretty" numbers for axis marks. If your data range is $[0, 93]$, the scale will intelligently choose $[0, 20, 40, 60, 80, 100]$ as ticks rather than arbitrary values.
  • Formatting: Support for scientific notation, currency symbols, percentages, and custom date-time formats.
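
A common heuristic for "pretty" ticks rounds the raw step to 1, 2, 5, or 10 times a power of ten. The sketch below uses that heuristic with hypothetical function names and thresholds; it is not necessarily the algorithm Charton implements, and it assumes max > min and a non-zero interval count.

```rust
/// Rounds the raw step (span / target_intervals) to 1, 2, 5, or 10 times a
/// power of ten.
fn nice_step(span: f64, target_intervals: usize) -> f64 {
    let raw = span / target_intervals as f64;
    let magnitude = 10f64.powi(raw.log10().floor() as i32);
    let normalized = raw / magnitude; // now in [1, 10)
    let factor = if normalized < 1.5 {
        1.0
    } else if normalized < 3.0 {
        2.0
    } else if normalized < 7.0 {
        5.0
    } else {
        10.0
    };
    factor * magnitude
}

/// Generates ticks on multiples of the nice step that cover [min, max].
fn nice_ticks(min: f64, max: f64, target_intervals: usize) -> Vec<f64> {
    let step = nice_step(max - min, target_intervals);
    let start = (min / step).floor() * step;
    let end = (max / step).ceil() * step;
    let count = ((end - start) / step).round() as usize;
    (0..=count).map(|i| start + i as f64 * step).collect()
}

fn main() {
    // For the [0, 93] example with about five intervals, the raw step 18.6
    // is rounded to 20.
    println!("{:?}", nice_ticks(0.0, 93.0, 5));
}
```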

Scale Arbitration

In a multi-layered chart, Scales act as a "Single Source of Truth." When different layers share an axis, Charton performs Arbitration:

  1. Scanning: The engine identifies the "Global Union" of all data domains across all layers.
  2. Unification: A single, unified scale is created that is large enough to encompass the data from every layer.
  3. Distribution: This unified scale is injected back into each layer, ensuring that they are all drawn within the same mathematical coordinate system.
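
The scanning and unification steps amount to taking the union of the per-layer extents. A minimal sketch for continuous domains, with a hypothetical `unify_domains` helper:

```rust
/// Returns the smallest (min, max) interval that covers every layer's
/// domain, or None when there are no layers.
pub fn unify_domains(layers: &[(f64, f64)]) -> Option<(f64, f64)> {
    layers
        .iter()
        .copied()
        .reduce(|(lo, hi), (a, b)| (lo.min(a), hi.max(b)))
}
```

Every layer would then be drawn against the returned interval rather than its own local extent.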

Manual Overrides

While Charton automates most scaling logic, you retain full control through explicit overrides:

  • Force Zero: Ensure an axis starts at $0$ even if the minimum data point is much higher.
  • Fixed Intervals: Fix a percentage axis strictly between $[0, 1]$ to prevent it from auto-scaling based on a subset of data.
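
How such overrides could interact with the automatically derived domain is sketched below. The `DomainOverride` struct and `resolve_domain` function are hypothetical, and the precedence (a fixed interval wins over force-zero) is an assumption.

```rust
/// Hypothetical override settings for a single axis.
#[derive(Debug, Default, Clone, Copy)]
pub struct DomainOverride {
    pub force_zero: bool,
    pub fixed: Option<(f64, f64)>,
}

/// Applies the overrides to the domain derived from the data.
pub fn resolve_domain(data: (f64, f64), overrides: &DomainOverride) -> (f64, f64) {
    // A fixed interval replaces the data-driven domain entirely.
    if let Some(fixed) = overrides.fixed {
        return fixed;
    }
    let (mut lo, mut hi) = data;
    // Force-zero widens the domain just enough to include 0.
    if overrides.force_zero {
        lo = lo.min(0.0);
        hi = hi.max(0.0);
    }
    (lo, hi)
}
```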

Key Takeaways

  • Domain is data; Range is pixels or colors.
  • Scale Type determines how data distances translate into visual distances (Linear preserves differences; Log preserves ratios).
  • Automatic Arbitration ensures that composite layers are perfectly aligned.
  • The Tick System translates abstract math into readable information.

Marks & Geometries

While Encodings and Scales define the mathematical relationship between data and space, the Mark is the physical manifestation of that relationship. A Mark is the geometric primitive used to represent a data point or a set of data points.

The Role of a Mark

In Charton, a Mark is not just a drawing instruction; it is a Template that knows how to interpret resolved aesthetic values (pixels, hex codes, shapes) into final geometry.

As defined in the Mark trait, every mark type in Charton:

  1. Identifies itself: Each mark has a unique string identifier (e.g., "point", "bar").
  2. Provides Defaults: Marks define fallback values for properties like stroke, opacity, and shape if they are not explicitly mapped to data.
  3. Determines Rendering Logic: Different marks require different drawing strategies—a Point is a single coordinate, while an Area is a complex polygon.

Common Mark Types

Charton provides a rich set of marks to cover various visualization needs:

Point Mark (mark_point)

The simplest mark, representing each data row as an individual geometric shape.

  • Dimensions: Primarily uses X and Y.
  • Aesthetics: Heavily utilizes Shape, Size, and Color.
  • Use Case: Scatter plots and bubble charts.

Line Mark (mark_line)

Connects data points in a specific order (usually by the X-axis) to show trends.

  • Connectivity: Unlike points, the Line mark treats a sequence of rows as a single continuous path.
  • Visuals: Focuses on stroke_width and color.

Bar Mark (mark_bar)

Represents data as rectangles extending from a baseline.

  • Physicality: Bars have "width." Charton calculates this width based on the CoordLayout (Chapter 1.4) to ensure bars don't overlap unless intended.
  • Intervals: Uses X, Y (height), and sometimes Y2 (for ranged bars).

Area Mark (mark_area)

Similar to a line but filled between a baseline (Y2) and the data value (Y).

  • Topology: The mark forms a closed polygon, highlighting the area between two series or between a series and the zero baseline.

Specialized Marks

  • Rule & Tick: Used for annotations or error margins.
  • Rect: Drawing arbitrary rectangles based on coordinate pairs.
  • Text: Placing strings directly into the coordinate space.

From Mark to Geometry: The Renderer

Behind every Mark lies a corresponding Renderer. When Charton enters the "Realization" phase (Chapter 3.4), it translates the mark's configuration into physical geometry:

  • PointElement: A simple struct containing x, y, shape, size.
  • PathConfig: A collection of points and stroke properties used for Lines and Areas.
  • RectConfig: Defined by x, y, width, height for Bars and Histograms.

Marks and Categorical Stacking

One of Charton's advanced features is how Marks handle Stacking and Grouping.

As seen in the MarkBar implementation, when multiple series exist on the same X-coordinate:

  • Stacked: The Y value of the second mark starts at the Y end-point of the first.
  • Grouped (Side-by-Side): The X position is offset by a fraction of the "Slot width," ensuring bars are placed next to each other without manual coordinate calculation.
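
Both strategies reduce to simple offset arithmetic. The helpers below are a hypothetical sketch of that arithmetic, not the MarkBar code itself; they assume non-negative values for stacking and equal-width bars for grouping.

```rust
/// Stacked: returns (y_start, y_end) for each series in one X slot.
/// Each segment starts where the previous one ended.
pub fn stack(values: &[f64]) -> Vec<(f64, f64)> {
    let mut base = 0.0;
    values
        .iter()
        .map(|&v| {
            let segment = (base, base + v);
            base += v;
            segment
        })
        .collect()
}

/// Grouped: returns the X center of each of `n` bars placed side by side
/// inside a slot of width `slot_width` centered at `center`.
pub fn dodge(center: f64, slot_width: f64, n: usize) -> Vec<f64> {
    let bar = slot_width / n as f64;
    (0..n)
        .map(|i| center - slot_width / 2.0 + bar * (i as f64 + 0.5))
        .collect()
}
```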

Visual Consistency (The Mark Trait)

In mark.rs, the Mark trait ensures that all geometric primitives share a common interface. This allows the LayeredChart to treat a PointChart and a LineChart identically during the high-level orchestration phase, even though their low-level draw calls are completely different.

Key Takeaways

  • Marks are the "ink" on the page.
  • Mark choice changes the narrative of the data (e.g., a Line implies a trend, while a Bar implies a comparison).
  • Geometric resolution is the final step where abstract scales are converted into physical shapes (Circles, Rects, Paths).

Coordinate Systems

If Scales define the mathematical mapping of data, the Coordinate System defines the physical space where that mapping is realized. It determines how the $(x, y)$ pairs from our scales are positioned and distorted on the canvas to create different types of visualizations.

The Role of Coordinates

The coordinate system is the final arbiter of geometry. It takes normalized values (from 0 to 1) and translates them into physical coordinates. Its responsibilities include:

  • Spatial Translation: Converting abstract ratios into pixel positions.
  • Axis Orientation: Deciding if the X-axis runs horizontally or wraps around a circle.
  • Visual Transformation: Handling effects like "Coord Flip" (swapping X and Y) or polar projections.
  • Clipping: Determining if data points that fall outside the defined boundaries should be hidden.

Cartesian Coordinates (Cartesian2D)

The Cartesian system is the most common coordinate system, mapping data onto a rectangular plane.

  • Linear Mapping: It directly maps normalized X to width and normalized Y to height.
  • Coordinate Flipping: Charton supports a "Flipped" state. When enabled, the X-axis becomes vertical and the Y-axis becomes horizontal. This is a powerful way to turn a vertical Bar Chart into a horizontal one without changing the underlying data encoding.
  • Layout Hints: The Cartesian system provides specific hints to the layout engine, such as suggesting that bars should occupy 50% of their available slot width by default to ensure readability.
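
The linear mapping and the flipped state can be sketched in a few lines. The `Panel` struct and `cartesian_to_pixel` function are hypothetical, and the sketch assumes a screen coordinate system whose Y axis grows downward.

```rust
/// Hypothetical pixel rectangle reserved for the plot.
pub struct Panel {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Maps normalized (nx, ny) in [0, 1] to pixel coordinates.
/// When `flipped` is true, the X value runs vertically and the Y value horizontally.
pub fn cartesian_to_pixel(nx: f64, ny: f64, panel: &Panel, flipped: bool) -> (f64, f64) {
    let (horizontal, vertical) = if flipped { (ny, nx) } else { (nx, ny) };
    // Screen Y grows downward, so the vertical ratio is inverted.
    (
        panel.x + horizontal * panel.width,
        panel.y + (1.0 - vertical) * panel.height,
    )
}
```

The data encoding stays the same in both cases; only the final assignment of ratios to panel dimensions changes.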

Polar Coordinates (Polar)

The Polar coordinate system maps data into a circular space, which is essential for radial visualizations like Pie charts, Donut charts, and Rose plots.

  • Angle and Radius:
    • The X dimension is mapped to the Angle (theta), typically spanning from $0$ to $2\pi$.
    • The Y dimension is mapped to the Radius (r), extending from the center to the outer edge.
  • Angular Customization: Users can define the start_angle (e.g., starting from the 12 o'clock position) and the total end_angle for partial circular plots.
  • Donut Configurations: By adjusting the inner_radius, the system can create a hole in the center, transforming a standard radial plot into a donut-style visualization.

The Coordinate Pipeline

Regardless of the system chosen, Charton follows a unified interface for rendering:

  1. Normalization: Data is first converted to a $[0, 1]$ range by the Scales.
  2. Transformation: The Coordinate System takes these ratios. For example, in a Polar system, it calculates $x = r \cdot \cos(\theta)$ and $y = r \cdot \sin(\theta)$.
  3. Canvas Mapping: The resulting points are scaled and shifted to fit within the PanelContext—the actual physical rectangle reserved for the plot.
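
For the polar case, the three steps can be sketched as follows. The function and parameter names are hypothetical; `inner_radius` is assumed to be a fraction of the outer radius, and angles are applied directly in screen space, so positive angles turn clockwise on a canvas whose Y axis grows downward.

```rust
use std::f64::consts::TAU;

/// Hypothetical pixel rectangle reserved for the plot.
pub struct Panel {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Maps normalized (nx, ny) to pixels: nx becomes the angle, ny the radius.
pub fn polar_to_pixel(
    nx: f64,
    ny: f64,
    panel: &Panel,
    start_angle: f64,
    inner_radius: f64,
) -> (f64, f64) {
    // Transformation: ratio -> angle and radius.
    let theta = start_angle + nx * TAU;
    let max_r = panel.width.min(panel.height) / 2.0;
    let r = max_r * (inner_radius + ny * (1.0 - inner_radius));
    // Canvas mapping: shift to the panel center.
    let cx = panel.x + panel.width / 2.0;
    let cy = panel.y + panel.height / 2.0;
    (cx + r * theta.cos(), cy + r * theta.sin())
}
```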

Coordinate-Driven Layouts

Different coordinate systems require different aesthetic defaults. As specified in the layout logic:

  • Cartesian Bars often have gaps between groups to distinguish categories.
  • Polar Sectors (like in a Pie chart) often have zero spacing by default to maintain a solid, continuous circular shape.

The Coordinate System communicates these "Layout Hints" to the Marks, ensuring that the visualization looks correct out-of-the-box, whether it's a grid-based scatter plot or a radial wind rose.


Key Takeaways

  • Cartesian is for rectangular grids; Polar is for circular/radial views.
  • Coordinate Flipping allows for easy orientation changes (Horizontal vs. Vertical).
  • Coordinate Systems are the final step in the transformation pipeline, converting mathematical ratios into physical geometry.

Layering Grammar

A single geometric mark can rarely tell a complete data story. To compare raw observations against a mathematical baseline, or to display standard error bars over a bar chart, we need the ability to combine multiple visual elements. In the Grammar of Graphics, this is achieved through Layering.

Layering allows you to stack multiple independent plots on top of each other within a shared spatial and mathematical plane, creating sophisticated, multi-valued visualizations.

The Layer as an Independent Declarative Unit

In the system architecture, a layer is encapsulated as an independent specification. Each layer maintains its own isolated context:

  • Isolated Dataset: A specific Polars DataFrame.
  • Local Encodings: Individual specifications mapping its data fields to visual channels (e.g., Layer A maps X and Y, while Layer B maps X, Y, and Color).
  • Geometric Mark: The distinct renderer (e.g., Point, Line, Bar) that dictates its final drawing primitive.

When you compose a layered chart, you do not perform database joins or manual array concatenations. Instead, you build an ordered vector of these independent logical units:

#![allow(unused)]
fn main() {
// Composing a layered chart by stacking a Line trend over a Point scatter
let composite_chart = LayeredChart::new()
    .add_layer(Chart::new(df_points).mark_point().encode((x("age"), y("height"))))
    .add_layer(Chart::new(df_trend).mark_line().encode((x("age"), y("predicted_height"))));
}

Shared Context vs. Local Overrides

A fundamental challenge in multi-layer rendering is balancing global consistency with local flexibility. The orchestrator divides constraints into two categories:

Shared Global Constraints

By default, all layers in a composite chart share a singular Coordinate System (e.g., a shared Cartesian plane) and a unified set of Aesthetic Scales. If both a scatter layer and a line layer map a categorical field named "species" to the Color channel, they are strictly bound to the exact same color palette.

Local Scale Preferences

Despite sharing the same axis, an individual layer can request distinct structural parameters. For example, Layer A might explicitly request a Linear scale transformation, while Layer B requests a Log scale for its specific value range.

The global orchestrator evaluates these conflicting requests during the initialization phase, using deterministic hierarchy rules to resolve them into a single mathematical source of truth.

State Injection & Interior Mutability

Because layers are entirely decoupled when declared by the user, the engine must bridge the gap between abstract global scales and localized rendering. This is achieved via a dedicated Injection Phase utilizing Rust's interior mutability patterns.

  1. The Shared Pointer Channel: The global coordinate system and aesthetic mappings are compiled into thread-safe shared pointers (Arc<dyn CoordinateTrait> and GlobalAesthetics).
  2. Back-Filling via Interior Mutability: The orchestrator traverses the layer vector and invokes .inject_resolved_scales(...).
  3. Caching the Scales: Because layers are passed around by shared reference or standard borrowing during the high-level coordination phase, they internally utilize mutation safe-guards (such as RwLock or OnceLock) to cache these global scales.

Once back-filling is complete, every layer possesses immediate, zero-copy access to the final canvas boundaries and visual mappers.
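
The injection pattern can be illustrated with the standard library's `OnceLock` and `Arc`. The types below are hypothetical stand-ins: the real signature of `inject_resolved_scales` is not shown in this chapter, so the one used here is an assumption.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical bundle of globally resolved scales.
pub struct ResolvedScales {
    pub x_domain: (f64, f64),
    pub y_domain: (f64, f64),
}

/// Hypothetical layer that caches the shared scales behind a OnceLock.
pub struct Layer {
    pub name: String,
    scales: OnceLock<Arc<ResolvedScales>>,
}

impl Layer {
    pub fn new(name: &str) -> Self {
        Layer { name: name.to_string(), scales: OnceLock::new() }
    }

    /// Takes `&self`, not `&mut self`: the OnceLock provides the interior
    /// mutability. Returns false if scales were already injected.
    pub fn inject_resolved_scales(&self, scales: Arc<ResolvedScales>) -> bool {
        self.scales.set(scales).is_ok()
    }

    pub fn scales(&self) -> Option<&Arc<ResolvedScales>> {
        self.scales.get()
    }
}

/// Back-fills every layer with a pointer to the same resolved scales.
pub fn inject_all(layers: &[Layer], scales: ResolvedScales) -> Arc<ResolvedScales> {
    let shared = Arc::new(scales);
    for layer in layers {
        layer.inject_resolved_scales(Arc::clone(&shared));
    }
    shared
}
```

Cloning the `Arc` only increments a reference count, so each layer reads the same scale data without copying it.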

The Realization Pipeline & Render Order

When the chart transitions to physical drawing, execution follows a strict sequential pipeline within a localized PanelContext:

  • Z-Index Execution: The engine iterates through the resolved layers in the exact order they were registered by the user (First In, First Out). The first layer added forms the background foundation, while subsequent layers are painted directly on top.
  • Stateless Mark Rendering: During this phase, individual geometric marks are entirely stateless. They do not know—nor do they need to know—if they are rendering alone or alongside ten other layers. They simply read their local data arrays, map them through the injected global coordinate scales, and emit concrete primitives (Circles, Paths, Rects) to the active RenderBackend.

Automated Legend Unification

Layering does not just unify spatial axes; it also unifies visual guides. If multiple layers share an identical aesthetic mapping (e.g., both a Point mark and an Area mark map the color channel to a field named "group"):

  • Semantic Deduction: The guide engine detects the shared field name across the global aesthetic registry.
  • Consolidation: Instead of rendering two separate legends that compete for canvas real estate, the system merges the structural definitions into a single, comprehensive Legend Spec.
  • Space Preservation: This consolidated specification is handed to the layout manager, ensuring that margins are calculated once, maximizing the physical pixels left for the actual plot panel.

Key Takeaways

  • Layers are logical stacks: They are isolated in specification but unified in mathematical space.
  • Interior mutability allows the global orchestrator to inject resolved scales into read-only layer definitions without expensive object copying.
  • Sequential rendering enforces a predictable Z-index layout, allowing complex visualizations to be built out of simple, composable geometric blocks.

Space Manager

In the Grammar of Graphics, defining data layers is only the first step. To create a professional visualization, you must orchestrate the placement of axes, legends, titles, and the plotting area itself. The Space Manager is Charton’s layout engine, responsible for the precise partitioning of the physical canvas.

The Box Model Philosophy

Charton adopts a nested "Box Model" similar to modern web layout engines. Every chart is composed of concentric rectangular regions, each serving a specific purpose:

  1. Plot Panel: The central core where coordinates are resolved and geometric marks are rendered.
  2. Axis Region: The area surrounding the Plot Panel, housing tick marks, labels, and axis titles.
  3. Guide Region: The outermost boundary used for placement of Legends and ColorBars.
  4. Canvas Padding: The external buffer area that prevents outer elements (such as rotated labels) from being clipped by the container boundary.

The Greedy Backfilling Algorithm

The Space Manager faces a classic "chicken-and-egg" problem: to determine the final size of the canvas, we need the axis label widths; but to calculate the density of axis labels, we need to know the available width.

Charton resolves this through a two-phase Backfilling Algorithm:

  • Phase 1: Estimation: The engine uses font metrics (such as estimate_text_width) to calculate the theoretical minimum space required by all Axis and Legend regions.
  • Phase 2: Constraint Injection: These calculated depths are injected into AxisLayoutConstraints and LegendLayoutConstraints structs. The Space Manager then shrinks the canvas from the outer edges toward the center, reserving the necessary space for guides and axes before finalizing the PanelContext for the core data plot.
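
The two phases can be sketched with plain rectangles. Everything here is hypothetical: `approx_text_width` is a crude stand-in for real font metrics (it assumes about 0.6 em per character), and `Reserved` merely plays the role of the constraint structs mentioned above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Depths reserved on each side for axes, legends, and padding.
#[derive(Debug, Clone, Copy, Default)]
pub struct Reserved {
    pub top: f64,
    pub right: f64,
    pub bottom: f64,
    pub left: f64,
}

/// Phase 1 stand-in: estimates label width from the character count.
pub fn approx_text_width(text: &str, font_size: f64) -> f64 {
    text.chars().count() as f64 * font_size * 0.6
}

/// Phase 2: shrinks the canvas from the outer edges toward the center.
/// Returns None when the reservations leave no room for the panel.
pub fn shrink(canvas: Rect, reserved: Reserved) -> Option<Rect> {
    let width = canvas.width - reserved.left - reserved.right;
    let height = canvas.height - reserved.top - reserved.bottom;
    if width <= 0.0 || height <= 0.0 {
        return None;
    }
    Some(Rect { x: canvas.x + reserved.left, y: canvas.y + reserved.top, width, height })
}
```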

Layout Control Strategies

The Space Manager offers fine-grained control to ensure visual uniformity across complex plots:

Synchronized Alignment

Even in multi-panel (faceted) visualizations, the Space Manager enforces "Synchronized Layouts." If Panel A has very long tick labels while Panel B has short ones, the Space Manager calculates the maximum required depth across all panels and applies it uniformly, ensuring that the plotting areas remain perfectly aligned.

Greedy Stacking (Flex-box logic)

When managing multiple legends or colorbars, the Space Manager applies a strategy similar to Flex-box layout:

  • Horizontal Stacking: Legends are laid out in a row; if the content exceeds the canvas width, it automatically wraps to a new row.
  • Vertical Stacking: When positioned on the sides, legends are stacked in columns, dynamically adjusting the axis spacing to accommodate the total height required by the consolidated legend blocks.
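
The horizontal wrapping rule resembles a greedy line-breaking pass. A minimal sketch, with a hypothetical `wrap_rows` helper that returns legend indices grouped by row:

```rust
/// Greedily packs legend blocks into rows no wider than `max_width`,
/// separating neighbors by `gap`. A block wider than `max_width`
/// still gets a row of its own.
pub fn wrap_rows(widths: &[f64], max_width: f64, gap: f64) -> Vec<Vec<usize>> {
    let mut rows: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut used = 0.0;
    for (i, &w) in widths.iter().enumerate() {
        let needed = if current.is_empty() { w } else { used + gap + w };
        if !current.is_empty() && needed > max_width {
            // The block does not fit: close the row and start a new one.
            rows.push(std::mem::take(&mut current));
            used = w;
        } else {
            used = needed;
        }
        current.push(i);
    }
    if !current.is_empty() {
        rows.push(current);
    }
    rows
}
```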

Aspect Ratio Preservation

While mark positions are data-driven, the Space Manager respects global aspect ratio constraints. If a fixed aspect ratio is requested, the manager calculates the optimal Panel size and treats the resulting excess pixels as additional outer padding, ensuring that the visual representation of the data remains undistorted.
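
The calculation is a standard "fit inside" computation. The sketch below assumes the ratio is expressed as width divided by height and that leftover pixels are split evenly between both sides; `fit_aspect` is a hypothetical name.

```rust
/// Fits the largest panel with the given width/height ratio into the
/// available area. Returns (width, height, pad_x, pad_y), where the pads
/// are the extra padding added on each side.
pub fn fit_aspect(avail_w: f64, avail_h: f64, ratio: f64) -> (f64, f64, f64, f64) {
    let (w, h) = if avail_w / avail_h > ratio {
        // Area is too wide: height is the limiting dimension.
        (avail_h * ratio, avail_h)
    } else {
        // Area is too tall: width is the limiting dimension.
        (avail_w, avail_w / ratio)
    };
    (w, h, (avail_w - w) / 2.0, (avail_h - h) / 2.0)
}
```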

Coordinate-Driven Compensation

The Space Manager is tightly coupled with the Coordinate System:

  • Rotation Compensation: When using Polar coordinates (e.g., in a radial plot), the Space Manager automatically detects the "angular sweep." It triggers a rotation compensation logic that recalculates the collision boundary for labels, preventing them from overlapping with the circular plotting area.
  • Flip Awareness: When a user invokes coord_flip(), the manager automatically swaps the depth-calculation logic for the axes. It recognizes that vertical labels in a standard plot become horizontal labels in a flipped plot, adjusting the padding calculations accordingly.

Key Takeaways

  • Nested Regions: Layout follows an "inside-out" box model progression.
  • Two-Phase Backfilling: The algorithm solves the cyclic dependency between label sizing and canvas allocation.
  • Synchronized Layouts: The engine ensures that multiple sub-plots maintain perfect alignment regardless of local content differences.
  • Adaptive Compensation: Layout strategies are coordinate-aware, automatically preventing overlaps based on whether the chart is Cartesian or Radial.

With this chapter, you have completed the journey through the Charton orchestration pipeline—from data ingestion and aesthetic mapping to spatial layout and rendering.

Guides: Axis & Legends

In a chart, the data marks (points, lines, bars) are the "what," but the Guides are the "how to read it." Guides act as the translation layer between abstract mathematical scales and human-readable visual cues. In Charton, these are primarily categorized into Axes (which interpret spatial mappings) and Legends (which interpret aesthetic mappings).

The Guide Hierarchy

Following the Grammar of Graphics, Charton treats Axes and Legends as secondary structures that are semantically derived from the primary Encoding specification.

  • Axes: Provide spatial context. They map the underlying continuous or discrete Scale domain to visual ticks and labels along the dimensions of the plotting area.
  • Legends: Provide categorical or gradient context. They map aesthetic scales (Color, Shape, Size) back to labeled groups or color ramps.

Axis Generation: From Math to Ticks

The creation of an axis is a multi-step process triggered during the rendering lifecycle:

  1. Tick Calculation: The system queries the Scale domain to determine the ideal intervals. For linear data, it uses a "pretty number" algorithm to ensure labels land on clean integers or decimal fractions. For temporal data, it selects sensible units (e.g., hours, days, or months).
  2. Physical Positioning: Using the Coordinate System (Cartesian or Polar), the axis generator calculates the pixel location for each tick. In Polar systems, this involves converting radial and angular distances into curved labels.
  3. Rotation & Collision: If labels are long or dense, the system calculates their physical footprint in pixels. If a conflict occurs, the system dynamically rotates labels (e.g., to 45-degree angles) to avoid overlap, ensuring readability.
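
The footprint of a rotated label is the axis-aligned bounding box of the rotated rectangle. The sketch below shows that geometry and a simple collision check against the tick spacing; both function names are hypothetical, and the check is a simplification of whatever Charton does internally.

```rust
/// Returns the (width, height) of the axis-aligned bounding box of a
/// `width` x `height` label rotated by `angle_deg` degrees.
pub fn rotated_footprint(width: f64, height: f64, angle_deg: f64) -> (f64, f64) {
    let angle = angle_deg.to_radians();
    let (s, c) = (angle.sin().abs(), angle.cos().abs());
    (width * c + height * s, width * s + height * c)
}

/// Reports whether any label is wider than the space between two ticks.
pub fn labels_collide(
    label_widths: &[f64],
    label_height: f64,
    angle_deg: f64,
    tick_spacing: f64,
) -> bool {
    label_widths
        .iter()
        .any(|&w| rotated_footprint(w, label_height, angle_deg).0 > tick_spacing)
}
```

A label 100 px wide and 10 px tall collides with an 80 px tick spacing when horizontal, but its horizontal footprint drops to about 78 px at 45 degrees.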

Legend Consolidation: The Semantic Bridge

A major architectural feature in Charton is the Semantic Merging of legends. In many visualization libraries, color, shape, and size legends are generated separately, leading to cluttered interfaces. Charton optimizes this:

  • Field Mapping: The system scans all aesthetic mappings (Color, Shape, Size) and groups them by their data Field name.
  • Unified Guide Specs: If Color and Shape both map to the same field (e.g., "Category"), the system merges their definitions into a single GuideSpec.
  • Rendering Strategy: The legend renderer then determines the GuideKind (Legend vs. ColorBar):
    • Discrete Legend: Used for categorical data. It builds a list of visual symbols (e.g., colored squares, specific shapes) that represent the group.
    • ColorBar: Used for continuous gradients. It renders a continuous strip that maps the data domain to a visual color ramp.
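
The merging rule can be sketched as a group-by on the field name. The `GuideSpec` and `GuideKind` types below are simplified, hypothetical versions of the ones named above, and the input tuple format is an assumption made for the example.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum GuideKind {
    Legend,
    ColorBar,
}

#[derive(Debug, PartialEq)]
pub struct GuideSpec {
    pub field: String,
    pub channels: Vec<String>,
    pub kind: GuideKind,
}

/// Groups (channel, field, is_continuous) mappings by field name so that
/// each field produces exactly one guide.
pub fn merge_guides(mappings: &[(&str, &str, bool)]) -> Vec<GuideSpec> {
    let mut by_field: BTreeMap<&str, (Vec<String>, bool)> = BTreeMap::new();
    for &(channel, field, continuous) in mappings {
        let entry = by_field.entry(field).or_insert_with(|| (Vec::new(), continuous));
        if !entry.0.iter().any(|c| c.as_str() == channel) {
            entry.0.push(channel.to_string());
        }
    }
    by_field
        .into_iter()
        .map(|(field, (channels, continuous))| GuideSpec {
            field: field.to_string(),
            channels,
            kind: if continuous { GuideKind::ColorBar } else { GuideKind::Legend },
        })
        .collect()
}
```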

Layout-Aware Rendering

Guides are not independent objects; they are aware of the chart's total physical constraints.

  • Space Reservation: Before the plotting area is determined, the Layout Engine queries the Guides for their required dimensions. If the axis labels are long, the engine increases the reserved padding on that side of the plot to prevent clipping.
  • Anchoring: Guides utilize a PanelContext to anchor themselves relative to the plot. For example, a legend placed on the Right position calculates its starting pixel coordinate based on the Panel width plus the theme-defined margin.

Why Consolidation Matters

The automated generation of these guides provides three core advantages:

  1. Mathematical Integrity: Because guides are generated directly from the resolved global scales, the labels are guaranteed to match the data precisely. You never have to manually update a label when the data changes.
  2. Reduced Visual Noise: By merging multiple aesthetics into a single legend block, the chart keeps the viewer's focus on the data, not on redundant interface elements.
  3. Automated Layout: Because the layout manager is aware of guide requirements, you don't need to manually configure margins. The chart automatically adjusts to accommodate the font sizes and number of categories present in your specific dataset.

Key Takeaways

  • Axes map space; Legends map aesthetics.
  • Guides are derived: They are not manually created, but are inferred directly from the Encoding and Scale specifications.
  • Semantic Merging: Multiple aesthetics mapping to the same field are consolidated into a single guide to minimize visual clutter.
  • Layout Awareness: Guides communicate their size requirements to the layout engine, ensuring the chart is always self-contained and perfectly padded.

Industrial Mastery

Performance & Scaling

When a visualization library transitions from an academic prototype to an industrial-grade production tool, the primary bottleneck shifts from architectural expressiveness to raw computational performance. In enterprise environments, charts are routinely required to process millions of observation rows, compute complex statistical overlays in real time, and render under tight latency budgets.

Charton achieves sub-millisecond execution speeds on large-scale datasets by optimizing three distinct layers: hardware-friendly data ingestion, high-performance associative hashing, and streamlined geometric compilation.

Zero-Copy Ingestion via Polars-Aligned Memory

Naive visualization tools often store data as rows of object instances, which introduces severe memory fragmentation and continuous pointer chasing. Charton eliminates this overhead by strictly enforcing a contiguous columnar memory layout.

As defined in the core data structures, a dataset is broken down into independent vectors of strongly typed primitives (ColumnVector variants such as Float64, Int32, or String).

  • Polars Optimization: This layout mirrors the memory alignment used by the Polars DataFrame engine. When you ingest a dataset from a Polars source, Charton can perform a near-zero-cost conversion, repurposing the underlying Arrow-backed memory buffers rather than duplicating arrays.
  • Cache Locality: By structuring data as contiguous primitive arrays (e.g., Vec<f64>), the CPU can efficiently pre-fetch values into its L1/L2 caches during axis scaling and coordinate transformations, maximizing hardware throughput.

High-Performance Hashing with ahash

Statistical marks—such as Box Plots and Kernel Density Estimations (KDE)—require the engine to group, bin, and partition continuous data rows based on categorical channels (like Color or X slots) before any drawing occurs. In standard implementations, these associative lookups represent a critical performance sink.

To solve this, Charton replaces Rust's default hasher (SipHash-based, and designed to resist hash-flooding attacks rather than to maximize speed) with ahash (AHashMap and AHashSet) inside its core transform pipelines (transform_boxplot_data and transform_density_data).

$$\text{Throughput Gain} = \frac{\text{Standard Hasher Latency}}{\text{AHashMap Latency}} \approx 3\times \text{ to } 5\times$$

Because visualization tasks operate within a controlled local context and do not require protection against Denial-of-Service (DoS) algorithmic attacks, Charton can trade that protection for speed: ahash uses hardware instructions such as AES-NI, where available, to compute non-cryptographic hashes in a fraction of the CPU cycles. This keeps the cost of grouping thousands of categorical intersections in the microsecond range.

Gap-Filling and Dense Categorical Alignment

When compiling complex multi-series statistical layouts (such as dodged box plots or stacked histograms), missing categories in certain sub-groups can break layout calculations, leading to uneven alignment grids.

Charton’s statistical transformers optimize this on-the-fly during the transformation step:

  1. Categorical Matrix Scanning: The transformer uses an AHashMap to compile the global cartesian product of all available categorical keys.
  2. Deterministic Gap Insertion: If a specific category combination lacks data points, the transformer automatically injects standard tracking boundaries or explicit f64::NAN rows.
  3. Downstream Predictability: This dense padding guarantees that the layout engine receives perfectly balanced arrays, allowing visual positions to be calculated in single-pass vector operations without nested runtime checks.
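
The gap-filling step can be sketched as a lookup over the cartesian product of category keys. To stay dependency-free, this sketch uses the standard `HashMap` where the text describes an AHashMap, and it takes the key lists as inputs instead of scanning for them; `fill_gaps` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Produces one row per (x, group) combination, in a fixed order.
/// Combinations that have no data receive an explicit NaN value.
pub fn fill_gaps(
    rows: &[(&str, &str, f64)],
    xs: &[&str],
    groups: &[&str],
) -> Vec<(String, String, f64)> {
    // Index the observed rows by their (x, group) key.
    let lookup: HashMap<(&str, &str), f64> =
        rows.iter().map(|&(x, g, v)| ((x, g), v)).collect();
    let mut dense = Vec::with_capacity(xs.len() * groups.len());
    for &x in xs {
        for &g in groups {
            let value = lookup.get(&(x, g)).copied().unwrap_or(f64::NAN);
            dense.push((x.to_string(), g.to_string(), value));
        }
    }
    dense
}
```

Because the output always has `xs.len() * groups.len()` rows in the same order, downstream layout code can index it without checking for missing combinations.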

Continuous Value Bound Sampling

For computationally heavy transformations like Kernel Density Estimation (KDE), evaluating the probability density at every single raw coordinate across millions of rows is redundant and slow. Charton optimizes the transformation pipeline by isolating boundary evaluations:

  • Grid Sample Generation: The system scans the continuous target column to establish the true minimum and maximum global boundaries.
  • Fixed Linear Interpolation: It then projects a fixed, customizable grid (e.g., 512 discrete sampling steps) between these boundaries.
  • Vectorized KDE Execution: The underlying kernel functions (Normal, Epanechnikov, or Uniform) are evaluated exclusively against these synthesized grid points, turning an $O(N^2)$ point-cloud computation into an $O(N \cdot \text{Grid})$ linear operational pass.
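
A fixed-grid evaluation with a Gaussian (Normal) kernel can be sketched as below. The function name and signature are hypothetical, and the bandwidth is passed in explicitly because this chapter does not describe how Charton selects it.

```rust
use std::f64::consts::PI;

/// Evaluates a Gaussian KDE on `grid_size` evenly spaced points between
/// the data minimum and maximum. Returns (x, density) pairs.
/// Cost is O(N * grid_size) instead of O(N^2).
pub fn kde_on_grid(data: &[f64], grid_size: usize, bandwidth: f64) -> Vec<(f64, f64)> {
    if data.is_empty() || grid_size < 2 || !(bandwidth > 0.0) {
        return Vec::new();
    }
    // Grid sample generation: find the global boundaries.
    let min = data.iter().copied().fold(f64::INFINITY, f64::min);
    let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let norm = 1.0 / (data.len() as f64 * bandwidth * (2.0 * PI).sqrt());
    (0..grid_size)
        .map(|i| {
            // Fixed linear interpolation between the boundaries.
            let g = min + (max - min) * i as f64 / (grid_size - 1) as f64;
            let sum: f64 = data
                .iter()
                .map(|&x| {
                    let u = (g - x) / bandwidth;
                    (-0.5 * u * u).exp()
                })
                .sum();
            (g, norm * sum)
        })
        .collect()
}
```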

Key Takeaways

  • Columnar Layouts: Mirroring Arrow/Polars memory structures minimizes transformation copies and maximizes CPU cache locality.
  • Ahash Acceleration: Relying on hardware-accelerated hashing makes grouping and categorizing thousands of data series practically free.
  • Fixed Grid Transformations: Decoupling raw rows from evaluation steps prevents complex statistics (like KDE) from scaling quadratically with data size.

Publication-Ready Themes

In enterprise-grade data analysis and professional publishing workflows, visual consistency is non-negotiable. Graphs produced by different teams or across different project cycles must strictly adhere to a unified visual identity. Manual adjustments of font sizes, line widths, and padding for every single plot are not only inefficient but also highly prone to error.

Charton addresses this requirement through a fully decoupled, semantic Theme System. By separating data-driven graphic specifications from non-data aesthetics, this system enables the automated enforcement of standardized visual guidelines, ensuring that every chart—from internal dashboards to formal reports—maintains a professional and cohesive appearance.

Architectural Foundation of Themes

As defined in theme.rs, a Theme is an immutable, comprehensive configuration container. Unlike traditional plotting libraries that might hard-code layout margins or text properties within geometric renderers, Charton’s architecture treats styles as externalized constants.

These themes are injected into the rendering pipeline through PanelContext, allowing the same data layer to be styled differently simply by swapping the theme reference. The system governs three critical stylistic dimensions:

  • Global Canvas & Spatial Margins: Controlled by parameters such as top_margin, right_margin, bottom_margin, and left_margin. These define the protective buffers around the canvas, ensuring that legends, titles, and axis elements do not bleed into the boundaries of the export surface.
  • Typographic Hierarchy: The system utilizes a font-stack strategy, governing specific layers—from title_size to tick_label_size and legend_label_size. This ensures clear, readable visual hierarchy across different output resolutions.
  • Visual Anchor Metrics: Geometric constraints such as axis_width, tick_length, legend_block_gap, and legend_marker_text_gap. By formalizing these metrics, the library maintains consistent layout density across varying chart complexities.

#![allow(unused)]
fn main() {
// Global theme injection ensures visual uniformity across all chart components.
let styled_chart = Chart::new(df)
    .mark_point()
    .encode((x("dosage"), y("efficacy")))
    .theme(Theme::corporate_standard()); // Enforces pre-defined visual guidelines
}

Layout-Safe Dynamic Adjustments

A primary challenge with static themes is that a layout ruleset might function perfectly for a simple scatter plot but fail in a dense, multi-panel (faceted) grid. Charton’s layout engine treats theme metrics as elastic constraints rather than fixed pixel values:

  • The Axis Reserve Buffer: Defined in theme.rs as axis_reserve_buffer, this creates a defensive boundary around the plotting panel. When text labels rotate (via x_tick_label_angle), the layout engine dynamically calculates the required space based on the rotated bounding box, preventing text truncation or boundary collisions.
  • Panel Defense Ratio: To prevent oversized legends or long categorical labels from shrinking the core visualization area to a sliver, the system enforces a panel_defense_ratio (defaulting to 0.2). This rule guarantees that no matter how many guide blocks are added to the outer regions, the central plotting area retains a significant percentage of the available canvas real estate.
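
One plausible reading of the defense rule is a clamp on the space that guides may claim along one dimension. The sketch below encodes that reading as an assumption; `defend_panel` is a hypothetical helper, and Charton's layout engine may apply the ratio differently.

```rust
/// Caps the space taken by guides so the panel keeps at least
/// `defense_ratio` of the canvas length. Returns (panel_len, guide_len).
pub fn defend_panel(canvas_len: f64, requested_guides: f64, defense_ratio: f64) -> (f64, f64) {
    let min_panel = canvas_len * defense_ratio;
    let guides = requested_guides.min(canvas_len - min_panel).max(0.0);
    (canvas_len - guides, guides)
}
```

With the default ratio of 0.2 on a 1000 px canvas, a request for 900 px of guides would be cut back to 800 px so that the panel keeps 200 px.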

Perceptually Uniform Palette Integration

Professional styling requires safeguarding the mathematical fidelity of continuous data. The theme engine integrates high-fidelity continuous mapping engines that prevent the artifacts common in naive RGB interpolation.

By utilizing predefined ColorMap strategies (such as Viridis or Magma), the theme ensures that equal steps in data magnitude correspond to approximately equal steps in perceived color difference. This reduces "false clustering", a visual bias common in non-uniform palettes such as rainbow ramps, so that color differences reflect the underlying data distribution rather than palette artifacts.

Key Takeaways

  • Complete Separation of Concerns: Data encodings (via Chart<T>) define what is visualized, while the Theme architecture dictates how it is physically presented.
  • Stateless Rendering: Renderers do not maintain internal style states; they query theme constants via PanelContext, ensuring that visual updates are immediate and globally consistent.
  • Defensive Layout Guardrails: Mechanisms like panel_defense_ratio and axis_reserve_buffer ensure that chart structural integrity is maintained automatically, even under complex or crowded layout conditions.

Safety & Error Registry

In high-performance visualization systems, the most costly errors are not rendering failures, but "logical mapping errors"—such as attempting to map continuous data to a discrete aesthetic channel, or misapplying coordinate scaling logic. Charton’s architecture is built on the philosophy of "Fail-Fast at Compile-Time" and "Structured Diagnostics."

Type-Safe Grammar and Validation

Charton’s core architecture enforces the legitimacy of its "Grammar of Graphics" at compile time, preventing developers from constructing invalid visual encodings.

  • Generic Type Guards: Utilizing the state-machine pattern within Chart<T>, Charton locks down operational logic via Rust's type system. For instance, statistical transformations like transform_boxplot_data are only accessible to specific mark types. Any attempt to apply unsupported operations triggers a compile-time error.
  • Channel-Scale Alignment: During the chart specification phase, the system validates the consistency between Channel (e.g., Color, X, Y) and the requested Scale. By checking that the target data field matches the scale's domain, the system eliminates runtime "undefined mapping" risks.
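The type-guard idea can be sketched with the type-state pattern. The types below are a self-contained stand-in, not Charton's real definitions: `NoMark`, `PointMark`, and `BoxplotMark` are hypothetical marker types, and the body of `transform_boxplot_data` is a simplified placeholder that only returns (min, median, max).

```rust
use std::marker::PhantomData;

// Hypothetical marker types standing in for Charton's mark states.
struct NoMark;
struct PointMark;
struct BoxplotMark;

struct Chart<T> {
    rows: Vec<f64>,
    _mark: PhantomData<T>,
}

impl Chart<NoMark> {
    fn build(rows: Vec<f64>) -> Self {
        Chart { rows, _mark: PhantomData }
    }
    fn mark_point(self) -> Chart<PointMark> {
        Chart { rows: self.rows, _mark: PhantomData }
    }
    fn mark_boxplot(self) -> Chart<BoxplotMark> {
        Chart { rows: self.rows, _mark: PhantomData }
    }
}

impl Chart<BoxplotMark> {
    /// Exists only in the boxplot state. Returns (min, median, max).
    fn transform_boxplot_data(&self) -> Option<(f64, f64, f64)> {
        let mut v = self.rows.clone();
        v.sort_by(|a, b| a.total_cmp(b));
        let n = v.len();
        if n == 0 {
            return None;
        }
        let median = if n % 2 == 1 {
            v[n / 2]
        } else {
            (v[n / 2 - 1] + v[n / 2]) / 2.0
        };
        Some((v[0], median, v[n - 1]))
    }
}
```

In this sketch, `Chart::build(rows).mark_point().transform_boxplot_data()` is rejected by the compiler because the method is defined only for `Chart<BoxplotMark>`.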

Structured Error Registry

To ensure developers can rapidly diagnose complex visual issues, Charton replaces ambiguous error messages with a structured ChartonError enum. This registry centralizes the management of potential logical conflicts:

use thiserror::Error;

// The #[error(...)] attributes below assume the `thiserror` derive macro.
#[derive(Debug, Error)]
pub enum ChartonError {
    /// Error related to data handling or processing.
    #[error("Data error: {0}")]
    Data(String),
    /// Error related to mark definitions or configurations.
    #[error("Mark error: {0}")]
    Mark(String),
    /// Error related to encoding specifications.
    #[error("Encoding error: {0}")]
    Encoding(String),
}

Every error variant carries a message describing the context in which it was raised:

  • Context-Aware Diagnostics: When the LayoutEngine calculates legend dimensions, if a panel_defense_ratio violation occurs, the registry returns a detailed report containing the current Rect state. This clearly identifies exactly which legend block caused the spatial overflow.
  • Recoverable Diagnostics: For non-fatal data issues (e.g., outliers slightly beyond the domain), the system supports "Soft Warning" modes. These warnings are logged in the registry, allowing the chart to perform automatic clipping instead of crashing.
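A "Soft Warning" mode of the kind described above might look like the following sketch. The function `clip_with_warnings` and its warning format are hypothetical, not part of Charton's API; the sketch assumes the domain satisfies `lo <= hi`.

```rust
/// Clamps each value into `domain` and records one warning per clipped value,
/// so the caller can keep rendering instead of aborting.
/// Assumes `domain.0 <= domain.1`.
fn clip_with_warnings(values: &[f64], domain: (f64, f64)) -> (Vec<f64>, Vec<String>) {
    let (lo, hi) = domain;
    let mut warnings = Vec::new();
    let clipped: Vec<f64> = values
        .iter()
        .enumerate()
        .map(|(i, &v)| {
            if v < lo || v > hi {
                warnings.push(format!(
                    "row {i}: value {v} outside domain [{lo}, {hi}], clipped"
                ));
            }
            v.clamp(lo, hi)
        })
        .collect();
    (clipped, warnings)
}
```

Returning the warnings alongside the data keeps the decision (log, surface to the user, or escalate to an error) with the caller.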

Semantic Validation Pipeline

Before any Chart object proceeds to the render phase, it must pass through a Semantic Validation stage—a defensive "wall" designed to ensure structural integrity:

  • Dimensional Validity: The system scans all Encoding mappings. If a MarkPoint is detected alongside incompatible StackMode parameters (usually reserved for MarkArea or MarkBar), the validator triggers a semantic error before expensive rendering begins.
  • Coordinate Scale Consistency: The system verifies that the coordinate system logic (e.g., Polar vs. Cartesian2D) matches the data's scale domains, preventing logical errors like using negative radial values in a polar chart.
  • CI/CD Integration: These error diagnostics can be serialized into JSON format. This allows for automated "Snapshot Testing" in CI/CD pipelines, where error-message hashes are compared to ensure that exception-handling mechanisms remain stable across library updates.
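The first two checks can be sketched as a single validation pass over a chart description. Everything here is illustrative: `Mark`, `Coord`, `Spec`, and `validate` are hypothetical stand-ins, not Charton's internal types, and errors are plain strings rather than `ChartonError` values.

```rust
#[derive(Debug, PartialEq)]
enum Mark {
    Point,
    Bar,
    Area,
}

#[derive(Debug, PartialEq)]
enum Coord {
    Cartesian2D,
    Polar,
}

/// Hypothetical, simplified chart description.
struct Spec {
    mark: Mark,
    stacked: bool,
    coord: Coord,
    radii: Vec<f64>, // values mapped to the radial (or y) channel
}

/// Runs cheap semantic checks before any rendering work starts.
fn validate(spec: &Spec) -> Result<(), String> {
    // Stacking is only meaningful for bar and area marks in this sketch.
    if spec.stacked && !matches!(spec.mark, Mark::Bar | Mark::Area) {
        return Err(format!("Encoding error: {:?} does not support stacking", spec.mark));
    }
    // A polar radius cannot be negative.
    if spec.coord == Coord::Polar {
        if let Some(v) = spec.radii.iter().find(|v| **v < 0.0) {
            return Err(format!("Data error: negative radial value {v} in polar coordinates"));
        }
    }
    Ok(())
}
```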

Key Takeaways

  • Compile-Time Safety: Leveraging Rust's strong type system, we intercept incorrect mapping logic during development rather than in the production runtime.
  • Structured Diagnostics: The error registry carries geometric context (panel sizes, coordinate parameters), significantly reducing the time required to debug complex layouts.
  • Semantic Integrity: The validation pipeline acts as a protective layer, ensuring that all generated charts adhere to the mathematical and geometric logic of the "Grammar of Graphics."

Production Integration

For a visualization library to succeed in an industrial ecosystem, it must extend beyond local rendering. It must integrate seamlessly into CI/CD pipelines, support automated regression testing, and provide deterministic output across various deployment environments—whether that is a cloud-native microservice, a WASM-powered browser interface, or a static report generation tool.

Automated Regression & Snapshot Testing

Visual libraries are notoriously difficult to test because the "correct" output is often subjective. Charton addresses this by implementing a Snapshot-Based Testing architecture.

  • Binary Snapshotting: Since Charton generates SVGs or binary image buffers through defined backends, we can serialize the output of a chart specification into a reference file. During CI, the library renders the chart and compares the output byte-by-byte (or via fuzzy perceptual hashing) against the golden master.
  • Deterministic Configuration: By forcing all layouts to rely on the Theme and Coordinate systems, we eliminate non-determinism caused by OS-specific font rendering or system-level float rounding, ensuring that a chart generated on a developer’s machine matches the production CI build exactly.
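A byte-for-byte snapshot check needs only the standard library. The helper below is a sketch, not a Charton API: `check_snapshot` is a hypothetical name, and the "write the golden file on first run" behaviour is one common convention rather than something the library prescribes.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compares `actual` with the golden file at `golden`.
/// If the golden file does not exist yet, it is created and the check passes.
fn check_snapshot(golden: &Path, actual: &[u8]) -> io::Result<bool> {
    match fs::read(golden) {
        Ok(expected) => Ok(expected == actual),
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            fs::write(golden, actual)?;
            Ok(true)
        }
        Err(e) => Err(e),
    }
}
```

In a test, the bytes of a rendered SVG would be passed as `actual`. A perceptual comparison would replace the `==` with a tolerance-based metric.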

CI/CD Pipeline Integration

Charton is engineered to be a "headless" citizen of the cloud. The bridge system (matplotlib.rs, altair.rs) and the core rendering pipeline are optimized for containerized environments:

  • Minimal Footprint: By utilizing high-performance columnar data layouts, Charton ensures that memory overhead remains flat even when processing datasets that would crash standard plotting tools. This makes it ideal for running in restricted-resource container environments (e.g., AWS Lambda, K8s sidecars).
  • Feature Flagging: The composite.rs implementation demonstrates how output formats (PNG, PDF, etc.) are controlled via Cargo features. This allows production builds to prune unnecessary dependencies, drastically reducing the binary size and attack surface in production environments.
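Feature gating of output formats can be sketched with `#[cfg(feature = ...)]`. The feature names `png` and `pdf` come from the feature matrix earlier in this guide; the functions `supported_formats` and `check_format` are hypothetical and do not reflect how composite.rs is actually written.

```rust
/// Lists the export formats compiled into this build.
/// The gated lines disappear entirely when the feature is disabled.
fn supported_formats() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut formats = vec!["svg"];
    #[cfg(feature = "png")]
    formats.push("png");
    #[cfg(feature = "pdf")]
    formats.push("pdf");
    formats
}

/// Returns the file extension if this build can export it.
fn check_format(path: &str) -> Result<&str, String> {
    let ext = path.rsplit('.').next().unwrap_or("");
    if supported_formats().contains(&ext) {
        Ok(ext)
    } else {
        Err(format!("format '{ext}' is not compiled into this build"))
    }
}
```

Because gated code is removed at compile time, any dependency used only by that code can be declared optional in Cargo.toml and left out of the build.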

Deployment Strategies

When deploying Charton-powered services, we recommend three distinct patterns depending on the load:

| Deployment Pattern | Use Case | Implementation Strategy |
| --- | --- | --- |
| Edge Rendering (WASM) | Interactive UI / Dashboards | Compile the core library to WASM for client-side rendering, minimizing server load. |
| Headless Batch Service | Automated Report Generation | Use a containerized Rust service to consume IPC data, render to SVG/PNG, and pipe to S3. |
| Bridge-Integrated API | Prototyping / Hybrid Apps | Utilize the Python bridge to leverage established libraries (Altair/Matplotlib) while maintaining data integrity via Rust's Polars types. |

Semantic Validation in CI

The Semantic Validation pipeline described in our Error Registry is essential for production stability. Before a deployment is cleared:

  • Schema Check: CI runners parse the user-provided ChartSpec and perform a static analysis to ensure all channels (X, Y, Color) are bound to valid data types.
  • Bounds Audit: The pipeline evaluates the dataset range against the ScaleDomain. If the system detects potential data overflow or empty domains before rendering, the build fails immediately, preventing "blank chart" incidents in the production dashboard.
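A bounds audit of this kind could be as small as the sketch below. `AuditError` and `audit_bounds` are hypothetical names, and treating non-finite values as missing is an assumption of the sketch, not documented Charton behaviour.

```rust
#[derive(Debug, PartialEq)]
enum AuditError {
    /// The scale domain has no extent (or is inverted / NaN).
    EmptyDomain,
    /// No finite data values were found.
    NoData,
    /// The observed data range falls outside the domain.
    OutOfDomain { min: f64, max: f64 },
}

/// Checks the observed data range against a scale domain before rendering.
fn audit_bounds(values: &[f64], domain: (f64, f64)) -> Result<(), AuditError> {
    let (lo, hi) = domain;
    if !(lo < hi) {
        return Err(AuditError::EmptyDomain);
    }
    let (min, max) = values
        .iter()
        .copied()
        .filter(|v| v.is_finite())
        .fold((f64::INFINITY, f64::NEG_INFINITY), |(mn, mx), v| {
            (mn.min(v), mx.max(v))
        });
    if min > max {
        return Err(AuditError::NoData);
    }
    if min < lo || max > hi {
        return Err(AuditError::OutOfDomain { min, max });
    }
    Ok(())
}
```

A CI job would call this on each dataset/domain pair and fail the build on any `Err`.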

Key Takeaways

  • Deterministic Outputs: Leverage snapshot testing to ensure that cross-environment chart generation yields identical visual results.
  • Resource Optimization: Use feature-gated builds and columnar data handling to ensure high-throughput rendering within containerized cloud environments.
  • Headless Capability: Design services for headless batch processing by treating the rendering engine as a pure, stateless function that transforms Dataset + Theme into a serialized visual artifact.

Semaglutide Weight Loss Curve (NEJM 2021)

Background

This figure is a reproduction of Figure 1A from the landmark study "Once-Weekly Semaglutide in Adults with Overweight or Obesity", published in The New England Journal of Medicine (NEJM) in 2021. The study evaluates the efficacy and safety of semaglutide as a pharmacological intervention for weight management.

The plot illustrates the mean percentage change in body weight over a 68-week period. It highlights the marked divergence in weight-loss trajectories between the semaglutide group and the placebo group, both of which also received a lifestyle intervention.

Data Acquisition

The data used for this visualization was extracted from the original publication using WebPlotDigitizer.

Implementation

Using Charton’s "Grammar of Graphics" approach, we can recreate this complex clinical plot by layering multiple graphical components, enabling highly flexible and customizable visualizations with concise Rust code.

use charton::prelude::*;
use std::error::Error;

// The data is obtained from paper "Once-Weekly Semaglutide in Adults with Overweight or Obesity"
// using [webplotdigitizer](https://automeris.io/).
fn main() -> Result<(), Box<dyn Error>> {
    // Placebo group data (control group)
    let ds_placebo = vec![
        // X-axis: time since randomization (weeks)
        (
            "Weeks since Randomization",
            [0, 4, 8, 12, 16, 20, 28, 36, 44, 52, 60, 68].into_column(),
        ),
        // Y-axis: mean percentage change in body weight from baseline. Negative values indicate weight loss
        (
            "Change from Baseline (%)",
            [
                0.00, -1.11, -1.72, -2.18, -2.54, -2.83, -2.82, -2.98, -3.24, -3.31, -3.22, -2.76,
            ]
            .into_column(),
        ),
        // Lower bound of the confidence interval (typically 95% CI)
        (
            "lower",
            [
                -0.042, -1.18, -1.81, -2.28, -2.66, -3.00, -3.03, -3.22, -3.49, -3.54, -3.46, -3.03,
            ]
            .into_column(),
        ),
        // Upper bound of the confidence interval (95% CI)
        (
            "upper",
            [
                0.042, -1.04, -1.63, -2.08, -2.42, -2.66, -2.61, -2.74, -2.99, -3.08, -2.98, -2.49,
            ]
            .into_column(),
        ),
    ]
    .to_dataset()?;

    // Semaglutide group data (treatment group)
    let ds_semaglutide = vec![
        (
            "Weeks since Randomization",
            [0, 4, 8, 12, 16, 20, 28, 36, 44, 52, 60, 68].into_column(),
        ),
        (
            "Change from Baseline (%)",
            [
                0.00, -2.27, -4.01, -5.9, -7.66, -9.46, -11.68, -13.33, -14.62, -15.47, -15.86,
                -15.6,
            ]
            .into_column(),
        ),
        (
            "lower",
            [
                -0.041, -2.3, -4.1, -5.98, -7.79, -9.58, -11.84, -13.55, -14.83, -15.72, -16.13,
                -15.86,
            ]
            .into_column(),
        ),
        (
            "upper",
            [
                0.041, -2.24, -3.92, -5.82, -7.53, -9.34, -11.52, -13.11, -14.41, -15.22, -15.59,
                -15.34,
            ]
            .into_column(),
        ),
    ]
    .to_dataset()?;

    // Text labels (placed at the right side of the plot)
    let ds_text = vec![
        ("x", [68.8, 68.8].into_column()),
        ("y", [-3.05, -15.86].into_column()),
        ("group", ["Placebo", "Semaglutide"].into_column()),
    ]
    .to_dataset()?;

    // Reference line (y = 0 → no weight change)
    let ds_reference = vec![
        ("x", [0.0, 76.0].into_column()),
        ("y", [0.0, 0.0].into_column()),
    ]
    .to_dataset()?;

    // Layer 1: Placebo points (markers at each time point)
    let placebo_point = Chart::build(&ds_placebo)?
        .mark_point()?
        .configure_point(|p| {
            p.with_color("#818284")
                .with_shape("triangle")
                .with_size(5.0)
        })
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("Change from Baseline (%)"),
        ))?;

    // Layer 2: Placebo line (connects the points)
    let placebo_line = Chart::build(&ds_placebo)?
        .mark_line()?
        .configure_line(|l| l.with_color("#818284"))
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("Change from Baseline (%)"),
        ))?;

    // Layer 3: Placebo error bars (confidence intervals)
    let placebo_errorbar = Chart::build(&ds_placebo)?
        .mark_errorbar()?
        .configure_errorbar(|e| {
            e.with_color("#818284")
                .with_cap_length(4.0)
                .with_stroke_width(1.5)
        })
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("lower"),
            alt::y2("upper"),
        ))?;

    // Layer 4: Placebo text label
    let placebo_text = Chart::build(ds_text.head(1))?
        .mark_text()?
        .configure_text(|t| t.with_anchor("left").with_size(14.0))
        .encode((alt::x("x"), alt::y("y"), alt::text("group")))?;

    // Layer 5: Semaglutide points
    let semaglutide_point = Chart::build(&ds_semaglutide)?
        .mark_point()?
        .configure_point(|p| p.with_color("#5b88c3").with_shape("square").with_size(3.0))
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("Change from Baseline (%)"),
        ))?;

    // Layer 6: Semaglutide line
    let semaglutide_line = Chart::build(&ds_semaglutide)?
        .mark_line()?
        .configure_line(|l| l.with_color("#5b88c3"))
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("Change from Baseline (%)"),
        ))?;

    // Layer 7: Semaglutide error bars
    let semaglutide_errorbar = Chart::build(&ds_semaglutide)?
        .mark_errorbar()?
        .configure_errorbar(|e| {
            e.with_color("#5b88c3")
                .with_cap_length(4.0)
                .with_stroke_width(1.5)
        })
        .encode((
            alt::x("Weeks since Randomization"),
            alt::y("lower"),
            alt::y2("upper"),
        ))?;

    // Layer 8: Semaglutide text label
    let semaglutide_text = Chart::build(ds_text.tail(1))?
        .mark_text()?
        .configure_text(|t| t.with_anchor("left").with_size(14.0))
        .encode((alt::x("x"), alt::y("y"), alt::text("group")))?;

    // Layer 9: Reference line (baseline at 0%)
    let reference_line = Chart::build(&ds_reference)?
        .mark_line()?
        .configure_line(|l| l.with_dash([6.0, 6.0]))
        .encode((alt::x("x"), alt::y("y")))?;

    // Combine all layers (Grammar of Graphics composition)
    placebo_point
        .and(reference_line)
        .and(placebo_line)
        .and(placebo_errorbar)
        .and(placebo_text)
        .and(semaglutide_point)
        .and(semaglutide_line)
        .and(semaglutide_errorbar)
        .and(semaglutide_text)
        .with_x_expand(Expansion {
            mult: (0.00, 0.0),
            add: (0.0, 0.0),
        })
        .with_y_expand(Expansion {
            mult: (0.15, 0.01),
            add: (0.0, 0.0),
        })
        .with_size(1000, 400)
        .with_right_margin(0.08)
        .with_left_margin(0.02)
        .with_top_margin(0.02)
        .with_bottom_margin(0.03)
        .with_x_ticks([
            0.0, 4.0, 8.0, 12.0, 16.0, 20.0, 28.0, 36.0, 44.0, 52.0, 60.0, 68.0,
        ])
        .with_y_ticks([
            0.0, -2.0, -4.0, -6.0, -8.0, -10.0, -12.0, -14.0, -16.0, -18.0,
        ])
        .save("docs/src/images/weight_loss_curve.svg")?;

    Ok(())
}

The 50k-Point Lorenz Attractor (WGPU & Zero-Allocation WASM)

The previous chapter's WGPU implementation (Chapter WebAssembly & Vega-Lite JSON, Part 2) was a "naive" port that re-allocated heap memory every frame. To push WebAssembly and your GPU to their limits, we will render a Lorenz Attractor—a complex, non-repeating 3D trajectory—simulating 50,000 dynamic particles at a locked 60 FPS.

Crucially, we will refactor our WASM boundary to achieve Zero-Allocation (Zero Malloc) during the render loop.

The Bottleneck: Why "Naive" WASM Stutters

Pushing Part 2's code to 50,000 points causes micro-stutters. The CPU, not the GPU, struggles with memory allocation. Our previous render_chart_gpu function ran this every frame:

let ds = Dataset::new()
    .with_column("x", xs.to_vec()) // Heap Allocation!
    // ...

Calling to_vec() 60 times a second on 50k points forces constant memory allocation and deallocation, choking the CPU and starving the GPU. The fix is shifting from a Stateless API to a Stateful Architecture using a persistent Rust struct to reuse memory.

Rust: The Stateful LiveChartApp

We will create a persistent Rust object that pre-allocates memory for our points at startup. During the animation loop, Charton's update_column_f64 performs an in-place memory copy, bypassing heap allocations entirely.
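The difference between allocating and overwriting can be shown with plain standard-library code. This is not Charton's implementation of update_column_f64, whose internals are not shown in this guide; `update_in_place` is a hypothetical helper that illustrates the same idea on a `Vec<f64>`.

```rust
/// Overwrites `dst` with `src` without touching the allocator.
/// Fails, rather than reallocating, when the lengths differ.
fn update_in_place(dst: &mut Vec<f64>, src: &[f64]) -> Result<(), String> {
    if dst.len() != src.len() {
        return Err(format!(
            "length mismatch: column has {}, update has {}",
            dst.len(),
            src.len()
        ));
    }
    // A plain memory copy into the existing buffer.
    dst.copy_from_slice(src);
    Ok(())
}
```

Comparing `dst.as_ptr()` before and after the call shows that the buffer address is unchanged, whereas `src.to_vec()` would return a freshly allocated buffer on every call.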

Update your src/lib.rs:

use charton::prelude::*;
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub struct LiveChartApp {
    dataset: Dataset,
    canvas_id: String,
}

#[wasm_bindgen]
impl LiveChartApp {
    #[wasm_bindgen(constructor)]
    pub fn new(canvas_id: String, capacity: usize) -> Result<LiveChartApp, JsValue> {
        let zeros = vec![0.0; capacity];

        let dataset = Dataset::new()
            .with_column("x", zeros.clone())
            .map_err(|e| e.to_string())?
            .with_column("y", zeros.clone())
            .map_err(|e| e.to_string())?
            .with_column("intensity", zeros)
            .map_err(|e| e.to_string())?;

        Ok(Self { dataset, canvas_id })
    }

    pub async fn update_and_render(
        &mut self,
        xs: &[f64],
        ys: &[f64],
        colors: &[f64],
    ) -> Result<(), JsValue> {
        // Zero-allocation memory overwrite
        self.dataset
            .update_column_f64("x", xs)
            .map_err(|e| e.to_string())?;
        self.dataset
            .update_column_f64("y", ys)
            .map_err(|e| e.to_string())?;
        self.dataset
            .update_column_f64("intensity", colors)
            .map_err(|e| e.to_string())?;

        // Build lightweight declarative chart and flush to WGPU Canvas
        Chart::build(self.dataset.clone())
            .map_err(|e| e.to_string())?
            .mark_point()
            .map_err(|e| e.to_string())?
            .configure_point(|p| p.with_size(1.0).with_opacity(0.4))
            .encode((
                // Lock domain to avoid full scan, adaptively fit based on standard Lorenz range
                alt::x("x"),
                alt::y("y"),
                alt::color("intensity"),
            ))
            .map_err(|e| e.to_string())?
            // Responsive dimensions fully controlled by frontend canvas styles
            .configure_theme(|t| {
                t.with_background_color("#090d16")
                    .with_color_map(ColorMap::Plasma)
                    .with_show_axes(false)
                    .with_show_legend(false)
                    .with_top_margin(0.10)
                    .with_bottom_margin(0.0)
                    .with_left_margin(0.0)
                    .with_right_margin(0.0)
            })
            .render_to_canvas(&self.canvas_id)
            .await
            .map_err(|e| e.to_string())?;

        Ok(())
    }
}

JavaScript: Chaos Math & 3D Projection

In index.html, we integrate the Lorenz equations in JavaScript. To create a rotating 3D effect, each frame recomputes the trajectory from the current slider values, rotates it about the z axis, and projects the result to 2D before streaming the data to WASM.
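For reference, the system being integrated and the projection applied to it can be written out as follows. The symbols match the slider names in the page (σ, ρ, β); θ denotes the accumulated rotation angle, and the subscript "screen" is notation introduced here for the 2D output.

```latex
\begin{aligned}
\dot{x} &= \sigma\,(y - x) \\
\dot{y} &= x\,(\rho - z) - y \\
\dot{z} &= x\,y - \beta\,z
\end{aligned}
\qquad
\begin{aligned}
x_{\text{screen}} &= x\cos\theta - y\sin\theta \\
y_{\text{screen}} &= z
\end{aligned}
```

The script advances the state with a fixed step of dt = 0.005, updating x, y, and z in sequence, so each update already sees the new value of the variables before it.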

Replace index.html with this:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Charton WASM — 50k Lorenz Chaos Engine</title>
    <style>
        :root {
            --bg-color: #05070f;
            --panel-bg: rgba(13, 17, 23, 0.75);
            --accent-color: #00f2ff;
            --accent-glow: rgba(0, 242, 255, 0.4);
            --text-color: #c9d1d9;
            --border-color: #21262d;
        }

        body {
            font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
            background: var(--bg-color);
            color: var(--text-color);
            margin: 0;
            padding: 0;
            height: 100vh;
            display: flex;
            justify-content: center;
            align-items: center; 
            background: radial-gradient(circle at center, #0f1626 0%, #05070f 100%);
            overflow: hidden;
        }

        /* Main container: max width limited to 1080px */
        #app-container {
            display: flex;
            width: 90vw;
            max-width: 1080px; 
            height: 80vh; 
            max-height: 720px; 
            background: #090d16;
            border-radius: 16px;
            border: 1px solid rgba(0, 242, 255, 0.15);
            box-shadow: 0 24px 60px rgba(0, 0, 0, 0.8);
            overflow: hidden;
        }

        /* Left panel: ultra-compact layout */
        #control-panel {
            width: 260px;
            background: var(--panel-bg);
            border-right: 1px solid var(--border-color);
            padding: 1.2rem; 
            display: flex;
            flex-direction: column;
            gap: 0.8rem; 
            box-sizing: border-box;
            backdrop-filter: blur(12px);
            height: 100%;
            overflow: hidden; /* Lock viewport, prevent scrollbars */
        }

        h2 {
            font-size: 0.95rem; 
            margin: 0;
            color: #fff;
            text-transform: uppercase;
            letter-spacing: 1px;
            border-bottom: 2px solid var(--border-color);
            padding-bottom: 0.4rem;
            flex-shrink: 0;
        }

        .stats-box {
            background: rgba(255, 255, 255, 0.03);
            border: 1px solid var(--border-color);
            border-radius: 6px;
            padding: 0.4rem 0.6rem;
            font-family: monospace;
            font-size: 0.75rem; 
            flex-shrink: 0;
        }

        .stat-line {
            display: flex;
            justify-content: space-between;
            margin-bottom: 0.15rem;
        }

        .stat-value {
            color: var(--accent-color);
            font-weight: bold;
            text-shadow: 0 0 8px var(--accent-glow);
        }

        #controls-container {
            flex: 1;
            display: flex;
            flex-direction: column;
            gap: 0.8rem; 
            overflow: hidden;
        }

        .control-group {
            display: flex;
            flex-direction: column;
            gap: 0.2rem; 
            flex-shrink: 0;
            box-sizing: border-box;
        }

        label {
            font-size: 0.75rem; 
            color: #8b949e;
            display: flex;
            justify-content: space-between;
        }

        .label-val {
            color: #fff;
            font-family: monospace;
        }

        /* Ultra-thin slider style design */
        input[type="range"] {
            -webkit-appearance: none;
            -moz-appearance: none;
            appearance: none;
            
            width: 100%;
            background: #161b22;
            height: 4px; 
            border-radius: 2px;
            outline: none;
            margin: 4px 0; 
            flex-shrink: 0;
        }

        input[type="range"]::-webkit-slider-thumb {
            -webkit-appearance: none;
            appearance: none;
            
            width: 12px; 
            height: 12px;
            border-radius: 50%;
            background: var(--accent-color);
            cursor: pointer;
            box-shadow: 0 0 6px var(--accent-color);
            transition: transform 0.1s;
        }

        input[type="range"]::-moz-range-thumb {
            border: none;
            width: 12px;
            height: 12px;
            border-radius: 50%;
            background: var(--accent-color);
            cursor: pointer;
            box-shadow: 0 0 6px var(--accent-color);
            transition: transform 0.1s;
        }

        /* Right canvas area */
        #stage {
            flex: 1;
            height: 100%;
            position: relative;
            background: #090d16;
        }

        #chart-canvas {
            width: 100%;
            height: 100%;
            display: block;
        }
    </style>
</head>
<body>

    <div id="app-container">
        
        <div id="control-panel">
            <h2>🦋 Lorenz Chaos</h2>
            
            <div class="stats-box">
                <div class="stat-line">
                    <span>Particles:</span>
                    <span class="stat-value">50,000</span>
                </div>
                <div class="stat-line">
                    <span>Performance:</span>
                    <span class="stat-value"><span id="fps-counter">0</span> FPS</span>
                </div>
            </div>

            <div id="controls-container">
                <div class="control-group">
                    <label>Sigma (σ) <span id="val-sigma" class="label-val">10.0</span></label>
                    <input type="range" id="param-sigma" min="1.0" max="30.0" step="0.1" value="10.0">
                </div>

                <div class="control-group">
                    <label>Rho (ρ) <span id="val-rho" class="label-val">28.0</span></label>
                    <input type="range" id="param-rho" min="5.0" max="50.0" step="0.1" value="28.0">
                </div>

                <div class="control-group">
                    <label>Beta (β) <span id="val-beta" class="label-val">2.67</span></label>
                    <input type="range" id="param-beta" min="0.5" max="5.0" step="0.01" value="2.666">
                </div>

                <div class="control-group">
                    <label>Rotation Speed <span id="val-speed" class="label-val">1.0</span></label>
                    <input type="range" id="param-speed" min="0.0" max="3.0" step="0.1" value="1.0">
                </div>
            </div>
        </div>

        <div id="stage">
            <canvas id="chart-canvas"></canvas>
        </div>

    </div>

    <script type="module">
        import init, { LiveChartApp } from './pkg/wave.js';

        async function run() {
            await init();

            const TOTAL_POINTS = 50000;
            const fpsCounter = document.getElementById('fps-counter');

            const sliders = {
                sigma: document.getElementById('param-sigma'),
                rho: document.getElementById('param-rho'),
                beta: document.getElementById('param-beta'),
                speed: document.getElementById('param-speed')
            };
            const labels = {
                sigma: document.getElementById('val-sigma'),
                rho: document.getElementById('val-rho'),
                beta: document.getElementById('val-beta'),
                speed: document.getElementById('val-speed')
            };

            Object.keys(sliders).forEach(key => {
                sliders[key].addEventListener('input', (e) => {
                    labels[key].textContent = parseFloat(e.target.value).toFixed(2);
                });
            });

            const app = new LiveChartApp("chart-canvas", TOTAL_POINTS);

            const xs = new Float64Array(TOTAL_POINTS);
            const ys = new Float64Array(TOTAL_POINTS);
            const colors = new Float64Array(TOTAL_POINTS);

            const lx = new Float64Array(TOTAL_POINTS);
            const ly = new Float64Array(TOTAL_POINTS);
            const lz = new Float64Array(TOTAL_POINTS);

            let angle = 0;
            let lastTime = performance.now();

            function computeLorenzTrajectory() {
                const sigma = parseFloat(sliders.sigma.value);
                const rho = parseFloat(sliders.rho.value);
                const beta = parseFloat(sliders.beta.value);
                
                let x = 0.1, y = 0.0, z = 0.0;
                const dt = 0.005;

                for (let i = 0; i < TOTAL_POINTS; i++) {
                    x += sigma * (y - x) * dt;
                    y += (x * (rho - z) - y) * dt;
                    z += (x * y - beta * z) * dt;
                    
                    lx[i] = x; 
                    ly[i] = y; 
                    lz[i] = z;
                    colors[i] = z;
                }
            }

            computeLorenzTrajectory();

            async function frameLoop() {
                const now = performance.now();
                fpsCounter.textContent = Math.round(1000 / (now - lastTime));
                lastTime = now;

                computeLorenzTrajectory();

                const speedModifier = parseFloat(sliders.speed.value);
                angle += 0.01 * speedModifier;
                
                const cosA = Math.cos(angle);
                const sinA = Math.sin(angle);

                for (let i = 0; i < TOTAL_POINTS; i++) {
                    xs[i] = lx[i] * cosA - ly[i] * sinA;
                    ys[i] = lz[i]; 
                }

                try {
                    await app.update_and_render(xs, ys, colors);
                } catch (e) {
                    console.error("WGPU Render failed:", e);
                }

                requestAnimationFrame(frameLoop);
            }

            frameLoop();
        }
        run();
    </script>
</body>
</html>

The Result: High-Density Chaos

Running this shows 50,000 points swirling fluidly in 3D. The small, semi-transparent points accumulate visually in dense regions, so the "butterfly wings" appear to glow. With the per-frame heap allocations removed from the Rust side, CPU load drops and the requestAnimationFrame loop can feed WGPU at up to your monitor's refresh rate.

Summary: Rules for High-Performance WASM

Keep these golden rules in mind for data-intensive JS/WASM applications:

  1. State is King: Avoid stateless APIs for high-frequency loops. Use a persistent Rust struct (#[wasm_bindgen] pub struct...) to hold heap-allocated memory.

  2. In-Place Mutation: Avoid .to_vec() (which triggers a malloc). Mutate vectors in place to reuse capacity, turning heavy memory operations into plain memory copies.

  3. Leverage Instancing: Pass flat float arrays (Float64Array) from JS to Rust. Charton pipes this raw geometry directly into WGPU Vertex Buffers, enabling parallel GPU rendering without CPU bottlenecks.

True interactive visualization via the Altair backend

Charton can generate fully interactive charts by delegating to Altair, which compiles to Vega-Lite specifications capable of:

  • Hover tooltips
  • Selections
  • Brush interactions
  • Zoom and pan
  • Linked views
  • Filtering and conditional styling
  • Rich UI semantics

Charton’s role in this workflow

Charton's job is to:

  1. Run Rust-side preprocessing (Polars)
  2. Transfer data to Python
  3. Embed user-provided Altair plotting code
  4. Invoke Python to generate Vega-Lite JSON
  5. Display the result (browser/Jupyter) or export JSON

All actual interactivity comes from Altair/Vega-Lite, not from Charton.

Example: interactive Altair chart via Charton

// evcxr (Jupyter) dependency directives
:dep charton = { version="0.3" }
:dep polars = { version="0.49" }

use charton::prelude::*;
use polars::prelude::df;

// Path to the Python interpreter that has Altair installed (adjust for your machine)
let exe_path = r"D:\Programs\miniconda3\envs\cellpy\python.exe";

let df1 = df![
    "Model" => ["S1", "M1", "R2", "P8", "M4", "T5", "V1"],
    "Price" => [2430, 3550, 5700, 8750, 2315, 3560, 980],
    "Discount" => [Some(0.65), Some(0.73), Some(0.82), None, Some(0.51), None, Some(0.26)],
].unwrap();

// Any valid Altair code can be placed here.
let raw_plotting_code = r#"
import altair as alt

chart = alt.Chart(df1).mark_point().encode(
    x='Price',
    y='Discount',
    color='Model',
    tooltip=['Model', 'Price', 'Discount']
).interactive()        # <-- zoom + pan + scroll
"#;

Plot::<Altair>::build(data!(&df1)?)?
    .with_exe_path(exe_path)?
    .with_plotting_code(raw_plotting_code)
    .show()?;  // Jupyter or browser

This provides real interactivity entirely through Altair.

Exporting Vega-Lite JSON for browser/Web app usage

Since Altair compiles to Vega-Lite, Charton can generate the JSON specification directly.

This is ideal for:

  • Web dashboards
  • React / Vue / Svelte components
  • Embedding charts in HTML
  • APIs returning visualization specs
  • Reproducible visualization pipelines

Example: Export to JSON

let chart_json: String = Plot::<Altair>::build(data!(&df1)?)?
    .with_exe_path(exe_path)?
    .with_plotting_code(raw_plotting_code)
    .to_json()?;

// save, embed, or send via API
println!("{}", chart_json);

The generated Vega-Lite JSON specification will look like this:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.20.1.json",
  "data": {
    "name": "data-8572dbb2f2fe2e54e92fc99f68a5f076"
  },
  "datasets": {
    "data-8572dbb2f2fe2e54e92fc99f68a5f076": [
      {
        "Discount": 0.65,
        "Model": "S1",
        "Price": 2430
      },
      // ... more data rows ...
    ]
  },
  "encoding": {
    "color": {
      "field": "Model",
      "type": "nominal"
    },
    "x": {
      "field": "Price",
      "type": "quantitative"
    },
    // ... other encoding and properties ...
  },
  "mark": {
    "type": "point"
  }
}

Embedding in a webpage:

To render the visualization, simply embed the generated JSON into your HTML using the vega-embed library:

<div id="vis"></div>
<script>
  var spec = /* paste JSON here */;
  vegaEmbed('#vis', spec);
</script>
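
The snippet above assumes the Vega libraries are already loaded on the page. As a minimal sketch, a Rust helper can wrap the exported JSON in a standalone HTML page that loads them from the jsDelivr CDN. The function name `embed_html` is hypothetical, and the major versions in the URLs (vega 5, vega-lite 5, vega-embed 6) are an assumption that should be matched to the `$schema` of your spec.

```rust
/// Wrap a Vega-Lite JSON spec in a standalone HTML page.
/// Hypothetical helper: it only does string templating and does not
/// validate the JSON it receives.
pub fn embed_html(spec_json: &str) -> String {
    format!(
        r#"<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
</head>
<body>
  <div id="vis"></div>
  <script>
    const spec = {spec};
    vegaEmbed('#vis', spec);
  </script>
</body>
</html>
"#,
        spec = spec_json
    )
}
```

The string returned by `.to_json()` can be passed straight to this helper and written to disk with `std::fs::write`. If the data may contain the text `</script>`, escape it before inlining, or serve the JSON from a separate URL instead.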

Summary: Hybrid Power

By leveraging Altair as a backend, Charton offers a unique "hybrid" workflow that combines the best of two worlds:

  1. Rust Efficiency: Handle heavy data crunching and complex Polars transformations with type safety and maximum performance.
  2. Python Ecosystem: Access the vast, mature visualization capabilities of Altair/Vega-Lite without leaving your Rust development environment.

Whether you are performing rapid Exploratory Data Analysis in a Jupyter notebook or shipping high-fidelity interactive dashboards to a web frontend, this bridge ensures you never have to choose between performance and features.

Part 1: WASM CPU-Driven SVG Animation

This chapter walks a complete beginner from zero to a real-time, animated scatter plot running in the browser, powered by Rust and WebAssembly.

The goal of this foundational stage is to understand the core Rust-to-WASM compilation pipeline. We expose a draw_wave() function from Rust that receives data arrays from JavaScript, lays out the chart on the CPU, and serializes it into a standard SVG string. JavaScript generates the data points and drives the animation loop, replacing the SVG markup in the browser's DOM at roughly 20–25 FPS.

Architectural Note: Every single frame in this implementation is a brand‑new, heavy XML text string computed entirely on the CPU within the WebAssembly sandbox. While perfect for lightweight or static charting, this text-swapping approach serves as our baseline benchmark. It sets the stage for the true power of GPU hardware acceleration explored in Part 2.
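
To make the cost of this approach concrete, here is a deliberately simplified sketch of what "one SVG string per frame" means. It is not Charton's renderer; `points_to_svg` is a hypothetical stand-in that emits one `<circle>` element per point, so the output grows linearly with the number of points.

```rust
use std::fmt::Write;

/// Serialize points (already in pixel coordinates) into an SVG document.
/// Every call builds the whole string again from scratch.
pub fn points_to_svg(points: &[(f64, f64)], width: u32, height: u32) -> String {
    // Rough size guess so the String does not reallocate repeatedly.
    let mut svg = String::with_capacity(64 + points.len() * 48);
    write!(
        svg,
        r#"<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">"#
    )
    .unwrap();
    for (x, y) in points {
        write!(svg, r#"<circle cx="{x:.1}" cy="{y:.1}" r="2"/>"#).unwrap();
    }
    svg.push_str("</svg>");
    svg
}
```

Each point costs a few dozen bytes of markup that the browser must parse into a DOM node on every frame, which is why this path suits hundreds of points rather than tens of thousands.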

0) Prerequisites

  • Rust toolchain (stable) – install via rustup
  • wasm-pack – install with: cargo install wasm-pack
  • A static file server – choose one:
    • Python: python -m http.server 8080 (make sure Python has been installed)
    • Node.js: npx serve .
    • Or any other HTTP server (browsers require HTTP for WASM)
  • clang (may be required on some systems)
    • Linux: sudo apt install clang
    • Windows: Download LLVM from releases.llvm.org and select Add LLVM to the system PATH
    • macOS: usually pre‑installed with Xcode command line tools

Important compatibility note: charton v0.5 depends on getrandom, which needs special configuration for wasm32-unknown-unknown. This tutorial includes all required settings.

1) Project Layout

Create a new project (e.g., cargo new wave --lib) and set up the following structure:

wave
├── Cargo.toml
├── index.html
├── pkg
└── src
    └── lib.rs

We will build a cdylib wasm package that wasm-pack will wrap into pkg/.

2) Cargo.toml

Put this into wave/Cargo.toml:

[package]
name = "wave"
version = "0.1.0"
edition = "2024"

[lib]
crate-type = ["cdylib"]     # Produces a dynamic library for WASM

[dependencies]
wasm-bindgen = "0.2"        # JS ↔ Rust bridge
charton = "0.5"             # Declarative plotting library

# getrandom must be explicitly added with the "wasm_js" feature flag
# for wasm32-unknown-unknown target support.
getrandom = { version = "0.3", features = ["wasm_js"] }

[profile.release]
opt-level = "s"             # Optimize for size
lto = true                  # Link-time optimization
codegen-units = 1           # Better optimization
panic = "abort"             # Smaller panic handler

3) src/lib.rs – Rust (wasm entry points)

Create a lib.rs file in the src directory and add the following code:

//! Charton WASM demo: real-time animated scatter plot with a color gradient.
//!
//! This module exposes a single function, `draw_wave`, which takes three
//! numeric arrays and returns an SVG string. When the color channel carries
//! the y-values, the points show a continuous color gradient along the wave.

use wasm_bindgen::prelude::*;
use charton::prelude::*;

/// Generate an SVG scatter plot with a color gradient.
///
/// # Arguments
/// * `xs` - X-axis values (e.g., time steps)
/// * `ys` - Y-axis values (e.g., amplitude)
/// * `colors` - Values for the continuous color scale (can be the same as `ys`)
///
/// # Returns
/// A `Result` containing the SVG string or a JavaScript error.
#[wasm_bindgen]
pub fn draw_wave(
    xs: Vec<f64>,
    ys: Vec<f64>,
    colors: Vec<f64>,
) -> Result<String, JsValue> {
    // Build a Charton Dataset from the three columns
    let ds = Dataset::new()
        .with_column("x", xs)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_column("y", ys)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_column("color", colors)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    // Build a chart using the declarative API
    let chart = Chart::build(ds)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .mark_point()                                       // Draw each row as a point
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .encode((                                           // Map columns to visual channels
            alt::x("x"),
            alt::y("y"),
            alt::color("color"),                            // Continuous color scale
        ))
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_size(800, 400)
        .configure_theme(|t| t.with_left_margin(0.01).with_top_margin(0.12).with_bottom_margin(0.05));

    // Render the chart to a static SVG string
    let svg = chart
        .to_svg()
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    Ok(svg)
}

4) Build with wasm-pack

From the project root (wave/):

wasm-pack build --release --target web --out-dir pkg

wasm-pack will:

  • Compile to wasm32-unknown-unknown
  • Run wasm-bindgen to generate JavaScript bindings
  • Output everything into pkg/:
    • wave_bg.wasm – the compiled WebAssembly binary
    • wave_bg.wasm.d.ts – TypeScript declaration file describing the shape of the compiled .wasm module.
    • wave.js – ES module bootstrap
    • wave.d.ts – TypeScript declarations (optional)

The .wasm file is roughly 300 KB in release mode. Gzip or Brotli compression reduces it further, which keeps it practical for web delivery.

5) index.html – Animated Frontend

Create index.html in the project root. The JavaScript:

  • Initialises the WASM module
  • Runs an animation loop with requestAnimationFrame
  • Pushes a new data point (sine wave + noise) every ~40 ms
  • Passes the arrays to draw_wave() and replaces the SVG in the DOM
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Charton WASM — Gradient Wave</title>
    <style>
        /* Dark background to make the gradient pop */
        body {
            font-family: system-ui, sans-serif;
            display: flex;
            flex-direction: column;
            align-items: center;
            margin: 0;
            padding-top: 2rem;
            background: #0d1117;
            color: #c9d1d9;
        }
        #chart {
            width: 800px;
            height: 400px;
            border-radius: 12px;
            background: #161b22;
            border: 1px solid #30363d;
            box-shadow: 0 4px 16px rgba(0,0,0,0.6);
        }
        .tag {
            margin-top: 1rem;
            font-size: 0.9rem;
            color: #8b949e;
        }
    </style>
</head>
<body>
    <h2>🌈 Charton + WASM — Gradient Wave</h2>
    <div id="chart"></div>
    <div class="tag">Every frame is a brand‑new SVG computed by Rust in WebAssembly</div>

    <script type="module">
        // Import the generated JS glue and the Rust function
        import init, { draw_wave } from './pkg/wave.js';

        async function run() {
            // Boot the WASM module
            await init();

            const container = document.getElementById('chart');
            const WINDOW_SIZE = 200;          // Show the latest 200 data points
            const ADD_INTERVAL_MS = 40;       // Add a new point every 40 ms
            let xs = [];
            let ys = [];
            let t = 0;                        // Time counter
            let lastAdd = 0;                  // Timestamp of the last data addition

            // Animation loop driven by requestAnimationFrame
            function loop(timestamp) {
                // Only append a new point if enough time has elapsed
                if (timestamp - lastAdd >= ADD_INTERVAL_MS) {
                    // Sine wave with a little noise for a more organic look
                    const y = Math.sin(t * 0.3) + (Math.random() - 0.5) * 0.2;
                    xs.push(t);
                    ys.push(y);

                    // Keep only the latest WINDOW_SIZE points
                    if (xs.length > WINDOW_SIZE) {
                        xs.shift();
                        ys.shift();
                    }
                    t += 0.5;
                    lastAdd = timestamp;

                    try {
                        // The color column is just a copy of the y-values,
                        // which gives a nice blue‑to‑orange gradient.
                        const svg = draw_wave(xs, ys, [...ys]);
                        container.innerHTML = svg;
                    } catch (e) {
                        console.error(e);
                    }
                }
                requestAnimationFrame(loop);
            }

            requestAnimationFrame(loop);
        }

        run();
    </script>
</body>
</html>
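
The page above keeps its sliding window in JavaScript arrays. If you later move data handling into Rust, the same logic could be sketched with a `VecDeque`, as below. The `SlidingWindow` type is hypothetical and not part of Charton; it only illustrates the "keep the latest WINDOW_SIZE points" rule.

```rust
use std::collections::VecDeque;

/// Fixed-size window over the most recent (x, y) samples.
pub struct SlidingWindow {
    xs: VecDeque<f64>,
    ys: VecDeque<f64>,
    capacity: usize,
}

impl SlidingWindow {
    pub fn new(capacity: usize) -> Self {
        Self {
            xs: VecDeque::with_capacity(capacity),
            ys: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Append a sample, dropping the oldest one once the window is full.
    pub fn push(&mut self, x: f64, y: f64) {
        if self.capacity == 0 {
            return;
        }
        if self.xs.len() == self.capacity {
            self.xs.pop_front();
            self.ys.pop_front();
        }
        self.xs.push_back(x);
        self.ys.push_back(y);
    }

    /// Copy the window out as two contiguous columns, oldest sample first.
    pub fn columns(&self) -> (Vec<f64>, Vec<f64>) {
        (
            self.xs.iter().copied().collect(),
            self.ys.iter().copied().collect(),
        )
    }
}
```

Unlike `Array.prototype.shift()`, which may move every remaining element, `pop_front` on a `VecDeque` runs in constant time.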

6) Serve and View

Open a terminal in the project directory and start a local server:

python -m http.server 8080

Then open http://localhost:8080 in your browser.

You will see a dark-themed page with a flowing stream of coloured dots – the colour changes smoothly from cool (trough) to warm (peak), and the entire chart is re‑rendered from scratch by Rust on every frame.

7) Troubleshooting

  • Compilation freezes / high RAM usage – Building for WASM can be heavy. If the process hangs during wasm-opt, you can stop it manually; the unoptimised .wasm is already functional and will run in the browser.
  • wasm-opt errors – If wasm-pack fails to install or run wasm-opt, ignore the error as long as pkg/ has been populated.
  • Port already in use – Try a different port: python -m http.server 8000.
  • Chart appears but no colour gradient – Make sure you are passing three vectors to draw_wave and that the third one is a numeric array (not all the same value). Check the browser console for any Rust panics.
  • Blank page or CORS errors – Always use an HTTP server, never open the HTML file directly with file://.

What's Next?

  • Adjust the animation pacing by changing the t += 0.5 step, ADD_INTERVAL_MS, and WINDOW_SIZE in index.html.
  • Replace the sine wave with real‑time data fetched from an API.
  • Explore Polars integration to pre‑process large datasets in the browser before plotting.

Part 2: Blazing-Fast GPU Acceleration via WGPU & Hybrid Typography

Building directly upon the foundations laid in Part 1 (where we created an animated SVG chart driven by the CPU), this chapter upgrades our rendering engine into a high-performance, native GPU pipeline.

When dealing with massive datasets, such as real-time financial tickers, high-frequency bioinformatic sequences, or dense sensor streams, the CPU-driven SVG approach quickly becomes the bottleneck. Every frame requires generating a large XML string and having the browser parse it back into DOM nodes, so the frame rate drops as the point count grows past a few hundred.

To break through this wall, we will rewrite our application to leverage WGPU (WebGPU/WebGL2) inside WebAssembly. We will push our dataset from a modest 200 points to 50,000+ points while targeting 60 FPS. Furthermore, we will use Charton's signature feature: the Zero-Allocation Deferred Ledger Architecture, which offloads text rendering back to the browser's native Canvas 2D subsystem so that labels benefit from the browser's own anti-aliasing.

Every frame is fully rendered via hardware acceleration directly on your graphics card.

0) Prerequisites

  • You must complete Part 1 and ensure your Rust WASM toolchain (wasm-pack, stable Rust) is fully operational.
  • A WebGPU-capable browser (Chromium-based browsers like Chrome/Edge v113+, or Firefox/Safari with experimental flags enabled). If WebGPU is unavailable, wgpu will automatically and seamlessly fall back to hardware-accelerated WebGL2.

1) Project Layout

We keep the same project layout as in Part 1. Only the file contents change, because we now target an HTML <canvas> surface instead of an injected SVG <div>:

wave
├── Cargo.toml
├── index.html
├── pkg
└── src
    └── lib.rs

2) Cargo.toml

We need to enable the wgpu feature flag for charton and include the mandatory browser dependencies (web-sys and wasm-bindgen-futures for handling async GPU device requests).

Update your wave/Cargo.toml to match the following configuration:

[package]
name = "wave"
version = "0.2.0"
edition = "2024"

[lib]
crate-type = ["cdylib"]

[dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4" # Required for awaiting asynchronous GPU adapters
charton = { version = "0.5", features = ["wgpu"] } # Enable WGPU feature

# Required for web-native target handling and canvas contexts
web-sys = { version = "0.3", features = [
    "Window",
    "Document",
    "HtmlCanvasElement",
    "OffscreenCanvas",
    "CanvasRenderingContext2d"
] }

getrandom = { version = "0.3", features = ["wasm_js"] }

[profile.release]
opt-level = "s"
lto = true
codegen-units = 1
panic = "abort"

3) src/lib.rs - The GPU Hardware Architecture

In this module, we expose an asynchronous render_chart_gpu function.

Instead of generating and serializing heavy string data back to JavaScript, the Rust engine directly acquires a handle to your browser's canvas, spawns an isolated OffscreenCanvas for the WGPU render pipeline, flushes pure geometric primitives directly into VRAM, and hands a lightweight text ledger back to the browser's native typography engine.

Replace the contents of src/lib.rs with the following implementation:

//! Charton WASM WGPU Demo: Extreme high-performance scatter plot rendering.
//!
//! This module binds directly to a browser-side HTML canvas, bypassing the DOM
//! completely to perform high-throughput GPU instanced drawing.

use wasm_bindgen::prelude::*;
use charton::prelude::*;

/// Render a massive dataset straight to a canvas using WGPU hardware acceleration.
///
/// # Arguments
/// * `canvas_id` - The DOM ID of the target HTML `<canvas>` element.
/// * `xs` - X-axis data buffer
/// * `ys` - Y-axis data buffer
/// * `colors` - Continuous scale data mapped to colors
#[wasm_bindgen]
pub async fn render_chart_gpu(
    canvas_id: String,
    xs: Vec<f64>,
    ys: Vec<f64>,
    colors: Vec<f64>,
) -> Result<(), JsValue> {
    // 1. Build a high-capacity Charton Dataset from inputs
    let ds = Dataset::new()
        .with_column("x", xs)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_column("y", ys)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_column("color", colors)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    // 2. Formulate the declarative chart specifications
    let chart = Chart::build(ds)
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .mark_point() // Leverages pure GPU Instancing under the hood
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .configure_point(|p| p.with_size(1.0))
        .encode((
            alt::x("x"),
            alt::y("y"),
            alt::color("color"),
        ))
        .map_err(|e| JsValue::from_str(&e.to_string()))?
        .with_size(800, 400)
        .configure_theme(|t| {
            t.with_left_margin(0.01)
             .with_top_margin(0.12)
             .with_bottom_margin(0.05)
        });

    // 3. Drive the hybrid orchestration pipeline (WGPU Geometry + Canvas 2D Text)
    // This executes the safe OffscreenCanvas blit and deferred ledger rendering natively.
    chart
        .render_to_canvas(&canvas_id)
        .await
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    Ok(())
}
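
Instanced drawing consumes one small, fixed-size record per point. The sketch below shows the general idea of packing three `f64` columns into a single interleaved `f32` buffer of the kind a GPU vertex or storage buffer would hold. The three-float layout and the `pack_instances` helper are assumptions made for illustration and do not describe Charton's actual buffer format.

```rust
/// Floats per point in this illustrative layout: x, y, color value.
pub const FLOATS_PER_INSTANCE: usize = 3;

/// Interleave three equally long columns into `out`, reusing its allocation.
/// Returns the number of instances written.
pub fn pack_instances(
    xs: &[f64],
    ys: &[f64],
    colors: &[f64],
    out: &mut Vec<f32>,
) -> Result<usize, String> {
    if xs.len() != ys.len() || xs.len() != colors.len() {
        return Err(format!(
            "column lengths differ: x={}, y={}, color={}",
            xs.len(),
            ys.len(),
            colors.len()
        ));
    }
    out.clear();
    out.reserve(xs.len() * FLOATS_PER_INSTANCE);
    for ((x, y), c) in xs.iter().zip(ys).zip(colors) {
        // GPUs generally work with f32, so precision is reduced here.
        out.extend_from_slice(&[*x as f32, *y as f32, *c as f32]);
    }
    Ok(xs.len())
}
```

At 50,000 points this layout occupies 600,000 bytes (50,000 × 3 × 4), small enough to re-upload on every frame.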

4) Build with wasm-pack

Recompile the project to re-generate your WebAssembly target and JavaScript glue bindings:

wasm-pack build --release --target web --out-dir pkg

5) index.html - The 50,000-Point Stress Test

Now let's construct the interactive frontend wrapper. To demonstrate the performance advantage of the GPU, the simulation loop initializes and renders 50,000 points right from the start, recomputes every point on each animation frame, and reports the measured frame rate.

Create or update index.html in your project root:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Charton WASM — GPU WGPU Pipeline</title>
    <style>
        body {
            font-family: system-ui, sans-serif;
            display: flex;
            flex-direction: column;
            align-items: center;
            margin: 0;
            padding-top: 2rem;
            background: #0d1117;
            color: #c9d1d9;
        }
        #canvas-container {
            position: relative;
            width: 800px;
            height: 400px;
            border-radius: 12px;
            background: #161b22;
            border: 1px solid #30363d;
            box-shadow: 0 4px 16px rgba(0,0,0,0.6);
            overflow: hidden;
        }
        /* High-performance hardware canvas surface */
        #chart-canvas {
            width: 800px;
            height: 400px;
            display: block;
        }
        .tag {
            margin-top: 1rem;
            font-size: 0.9rem;
            color: #8b949e;
        }
        #fps-counter {
            font-weight: bold;
            color: #58a6ff;
        }
    </style>
</head>
<body>
    <h2>⚡ Charton + WGPU — GPU Hardware Multi-Point Stress Test</h2>
    <div id="canvas-container">
        <canvas id="chart-canvas"></canvas>
    </div>
    <div class="tag">
        Active GPU Primitives: <span id="point-count">0</span> points | 
        Performance: <span id="fps-counter">Calculating...</span>
    </div>

    <script type="module">
        import init, { render_chart_gpu } from './pkg/wave.js';

        async function run() {
            // Boot the compiled WebAssembly package
            await init();

            const pointCountElement = document.getElementById('point-count');
            const fpsCounterElement = document.getElementById('fps-counter');

            // --- STRESS TEST CONFIGURATION ---
            const TOTAL_POINTS = 50_000; 
            
            let xs = new Float64Array(TOTAL_POINTS);
            let ys = new Float64Array(TOTAL_POINTS);
            let colors = new Float64Array(TOTAL_POINTS);

            // Populate initial massive data buffers (Complex overlapping waves)
            for (let i = 0; i < TOTAL_POINTS; i++) {
                xs[i] = i * 0.01;
                ys[i] = Math.sin(i * 0.05) * Math.cos(i * 0.002);
                colors[i] = ys[i];
            }
            
            pointCountElement.textContent = TOTAL_POINTS.toLocaleString();

            let lastCalledTime;
            let fps;
            let offset = 0;

            // requestAnimationFrame loop (runs at the display refresh rate, typically 60 FPS)
            async function frameLoop() {
                // Calculate real-time FPS metrics
                if (!lastCalledTime) {
                    lastCalledTime = performance.now();
                    fps = 0;
                } else {
                    let delta = (performance.now() - lastCalledTime) / 1000;
                    lastCalledTime = performance.now();
                    fps = Math.round(1 / delta);
                    fpsCounterElement.textContent = `${fps} FPS`;
                }

                // Dynamic simulation shift: mutate data values slightly over time
                offset += 0.02;
                for (let i = 0; i < TOTAL_POINTS; i++) {
                    // Update y and color buffers dynamically per frame
                    ys[i] = Math.sin(i * 0.05 + offset) * Math.cos(i * 0.002 + offset * 0.5);
                    colors[i] = ys[i];
                }

                try {
                    // Hand the typed arrays to Rust (wasm-bindgen copies them into WASM memory)
                    await render_chart_gpu("chart-canvas", xs, ys, colors);
                } catch (e) {
                    console.error("GPU Rendering Error:", e);
                }

                requestAnimationFrame(frameLoop);
            }

            // Fire up the GPU engine
            requestAnimationFrame(frameLoop);
        }

        run();
    </script>
</body>
</html>
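
The page above prints an instantaneous FPS value, which tends to flicker from frame to frame. A smoothed reading is often easier to interpret. The following sketch applies an exponential moving average to frame timestamps; the `FpsMeter` type and its `alpha` parameter are hypothetical and independent of Charton.

```rust
/// Frame-rate meter with exponential smoothing.
pub struct FpsMeter {
    last_ms: Option<f64>,
    smoothed: f64,
    alpha: f64, // weight of the newest sample, between 0 and 1
}

impl FpsMeter {
    pub fn new(alpha: f64) -> Self {
        Self { last_ms: None, smoothed: 0.0, alpha }
    }

    /// Feed a timestamp in milliseconds (e.g. from `performance.now()`).
    /// Returns `None` for the first frame or a non-positive time delta.
    pub fn tick(&mut self, now_ms: f64) -> Option<f64> {
        let prev = self.last_ms.replace(now_ms)?;
        let delta = now_ms - prev;
        if delta <= 0.0 {
            return None;
        }
        let instant = 1000.0 / delta;
        self.smoothed = if self.smoothed == 0.0 {
            instant
        } else {
            self.alpha * instant + (1.0 - self.alpha) * self.smoothed
        };
        Some(self.smoothed)
    }
}
```

A small `alpha` such as 0.1 gives a steadier display, while values close to 1 follow each frame more tightly.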

6) Run and Compare

Start your local static HTTP server:

python -m http.server 8080

Then open http://localhost:8080 in your browser.

What you are witnessing:

  • Look closely at the data points. You will see 50,000 independent points flowing across the screen simultaneously.
  • Look at the performance indicator. Despite rendering 250 times more data than the SVG example in Part 1 (50,000 points versus 200), the frame rate should stay at or near 60 FPS on WebGPU-capable hardware.
  • Look at the chart labels and axis tick marks. Thanks to the Deferred Ledger Design, text stays sharp regardless of your monitor's DPI scaling, and the WASM package does not need to bundle a multi-megabyte font file and text rasterizer.

7) Conclusion: The Power of Hybrid Architecture

By splitting the rendering pipeline into a Pure GPU Instancing Layer for Primitives and a Deferred Subsystem for Typography, charton delivers the best of both worlds.

You have now built and run a browser-based chart that uses this hybrid pipeline to redraw 50,000 points per frame, a workload that suits real-time, high-volume data streams.

This example serves as a fundamental demonstration of Charton’s core usage within WebAssembly; advanced high-performance patterns are covered in the Case Studies chapter.

Seamless Python Interop

In many data workflows, you may need to leverage the rich ecosystem of Python visualization libraries like Altair (for declarative Vega-Lite specifications) or Matplotlib (for procedural, publication-quality graphics). Charton handles this through the IPC (Inter-Process Communication) Bridge, a specialized layer that bridges the performance gap between Rust and Python without sacrificing type safety or data integrity.

The Architecture of the Bridge

The bridge is designed as a generic, extensible layer. At its core, it relies on two primary abstractions:

  • InputData: The serialization envelope. It captures a Polars DataFrame and wraps it with metadata (a string identifier) to ensure the Python environment knows exactly how to map the incoming data into a Pandas DataFrame.

  • Renderer Trait: A common interface that defines how data is transmitted and how plotting code is executed. Whether you use Altair or Matplotlib, the bridge handles the data transfer in a consistent manner.

Data Transfer: The Serialization Pipeline

Transferring data between Rust and Python is notoriously expensive if done naively. The bridge uses the binary Arrow IPC format, produced by Polars according to the Apache Arrow specification:

  • Serialization: Instead of converting data to plain JSON strings, the system serializes the DataFrame into an Arrow IPC stream.
  • Binary Transfer: This stream is encoded into Base64 (to ensure reliable transit over standard input/output) and passed into the Python process.
  • Reconstitution: On the Python side, the data is Base64-decoded and read back with pl.read_ipc. This call returns a Polars DataFrame, so code that expects a Pandas DataFrame needs one further conversion. The approach preserves the data types and schema of the original Rust DataFrame, and the main overhead is the Base64 encode and decode step.
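
To illustrate the Base64 step in isolation, here is a minimal standard-alphabet encoder written with the standard library only. It is a sketch for explanation; the bridge itself presumably relies on an existing crate, and the function name `base64_encode` is hypothetical.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded Base64 (RFC 4648 standard alphabet).
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into one 24-bit group, zero-filled.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, padding with '=' for missing bytes.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 {
            ALPHABET[(n >> 6) as usize & 63] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[n as usize & 63] as char
        } else {
            '='
        });
    }
    out
}
```

Every three input bytes become four output characters, so the Arrow payload grows by about one third in transit. That is the price of sending binary data safely over text-oriented standard streams.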

Execution Logic: The Plotting Sandbox

The Bridge executes plotting code in an isolated Python sub-process. Isolation here means a separate operating-system process rather than a security sandbox: the Python execution environment stays decoupled from the Rust main thread and from the memory of the Rust process.

The Rendering Lifecycle

  1. Instruction Preparation: The Rust side generates a full Python script. This script contains three parts:
    • Data Ingestion: A boilerplate header that listens to stdin and performs the read_ipc conversion.
    • User Code: The actual plotting logic defined by the user (e.g., calling alt.Chart() or plt.plot()).
    • Output Handling: A bridge-specific snippet that serializes the resulting figure (as PNG, SVG, or JSON) and writes it back to stdout.
  2. Process Execution: The ExternalRendererExecutor spawns a Python sub-process, passing the generated code and binary data via standard streams.
  3. Stream Recomposition: The main Rust process collects the resulting Base64-encoded binary from stdout, decodes it, and returns it to the user.
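
The instruction-preparation step can be pictured as plain string assembly. The sketch below composes the three parts in order. The function name `build_script`, the Python header text, and the choice of variable name are illustrative assumptions; the script Charton actually generates may differ.

```rust
/// Assemble a Python script from an ingestion header, the user's plotting
/// code, and an output snippet. `df_name` is the identifier under which
/// the user code expects to find the DataFrame.
pub fn build_script(df_name: &str, user_code: &str, output_snippet: &str) -> String {
    // Hypothetical header: read Base64 text from stdin, decode it, and
    // load the Arrow IPC payload with Polars.
    let header = format!(
        "import sys, io, base64\nimport polars as pl\n{df_name} = pl.read_ipc(io.BytesIO(base64.b64decode(sys.stdin.read())))\n"
    );
    format!(
        "{header}\n{user}\n\n{output}\n",
        user = user_code.trim(),
        output = output_snippet.trim()
    )
}
```

For the Altair example, `build_script("df1", raw_plotting_code, "print(chart.to_json())")` would yield a script that defines `df1`, runs the user's chart code, and prints the Vega-Lite JSON to stdout. The result could then be handed to a child process started with `std::process::Command`.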

Generic Renderer Implementation

The design uses Rust’s PhantomData to maintain type safety across different renderers. This means the compiler knows precisely which rendering logic you are using before the code even runs.

use std::marker::PhantomData;

// The Plot struct manages the data and the renderer identity
pub struct Plot<T: Renderer> {
    pub(crate) data: SerializedData,
    pub(crate) raw_plotting_code: String,
    pub(crate) _renderer: PhantomData<T>,
}

By decoupling the Plot struct from the specific renderer T, you can swap between Altair and Matplotlib by changing a single type parameter. Each renderer then implements its own specialized code generation:

  • Altair: The bridge targets JSON/SVG outputs, making it ideal for web-embedded visualizations.
  • Matplotlib: The bridge targets binary formats like PNG, perfect for high-resolution static exports.
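
The PhantomData pattern described above can be tried out in isolation. In this self-contained sketch, the trait constants `NAME` and `OUTPUT` and the `describe` method are hypothetical stand-ins. They show how a zero-sized type parameter selects renderer-specific behavior at compile time, and they do not reproduce Charton's real trait.

```rust
use std::marker::PhantomData;

/// Stand-in for the bridge's renderer interface (hypothetical shape).
pub trait Renderer {
    const NAME: &'static str;
    const OUTPUT: &'static str;
}

pub struct Altair;
pub struct Matplotlib;

impl Renderer for Altair {
    const NAME: &'static str = "altair";
    const OUTPUT: &'static str = "json";
}

impl Renderer for Matplotlib {
    const NAME: &'static str = "matplotlib";
    const OUTPUT: &'static str = "png";
}

/// The renderer is part of the type but occupies no memory.
pub struct Plot<T: Renderer> {
    raw_plotting_code: String,
    _renderer: PhantomData<T>,
}

impl<T: Renderer> Plot<T> {
    pub fn new(code: impl Into<String>) -> Self {
        Self { raw_plotting_code: code.into(), _renderer: PhantomData }
    }

    pub fn code(&self) -> &str {
        &self.raw_plotting_code
    }

    /// Resolved statically from `T`, with no runtime dispatch.
    pub fn describe(&self) -> String {
        format!("{} -> {}", T::NAME, T::OUTPUT)
    }
}
```

Switching from `Plot::<Altair>::new(code)` to `Plot::<Matplotlib>::new(code)` changes the selected constants without touching the struct's fields.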

Why an IPC Bridge?

Ecosystem Access: You get the performance of Rust data processing with the mature rendering power of Python’s massive visualization library ecosystem.

Process Isolation: If the Python plotting code crashes (e.g., due to memory issues or library errors), your main Rust application remains stable and continues running.

Schema Integrity: By utilizing Arrow IPC, the Bridge guarantees that date, time, and numeric types in Rust are accurately represented in Python, eliminating the common "type-mismatch" bugs seen in simpler JSON-based bridges.

Key Takeaways

  • High-Throughput: IPC-based binary transfer makes sharing large datasets between languages efficient.
  • Type Safety: Rust's PhantomData ensures the renderer-specific logic is validated at compile-time.
  • Isolation: The Bridge treats Python as an external, sandboxed rendering service, keeping your primary application robust and fast.

WGPU TEXT

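Rendering text purely with wgpu is very hard. This is a not very successful attempt: it can display text, but the position, color, sharpness, and rotation are all less than ideal. It is kept here as a backup for future reference.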

chart.wgsl

// ============================================================================
// Charton WGPU Shader: Unified Rendering Primitives
// Primitives: Circle, Rect, Line, Polygon, GradientRect, Path(Polyline/Area)
// Strictly Compliant with RenderBackend Contract
// ============================================================================

// ----------------------------------------------------------------------------
// Storage Buffer Data Structures (Semantically Separated)
// ----------------------------------------------------------------------------

/// Circle data (draw_circle: exclusive for circular markers/points)
struct PointData {
    x: f32,
    y: f32,
    r: f32,
    g: f32,
    b: f32,
    a: f32,
    radius: f32,
};

/// Rectangle data (draw_rect: bars, boxes, axis backgrounds)
struct RectData {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
    r: f32,
    g: f32,
    b: f32,
    a: f32,
    corner_radius: f32,
};

/// Single line segment data (draw_line: axis, grid, ticks, whiskers)
struct LineData {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
    r: f32,
    g: f32,
    b: f32,
    a: f32,
    width: f32,
    pad1: f32,
    pad2: f32,
    pad3: f32,
};

/// Polygon data (draw_polygon: symmetric markers - triangle, hexagon, diamond, star)
/// Fed directly via traditional Vertex Buffer stream (no dedicated storage slot needed)
struct PolygonData {
    x: f32,
    y: f32,
    r: f32,
    g: f32,
    b: f32,
    a: f32,
    radius: f32,
    shape_type: f32,
};

/// Gradient rectangle data (draw_gradient_rect: heatmaps, themed panels)
struct GradientRectData {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
    start_r: f32,
    start_g: f32,
    start_b: f32,
    start_a: f32,
    end_r: f32,
    end_g: f32,
    end_b: f32,
    end_a: f32,
    angle: f32,
    opacity: f32,
};

/// Individual Glyph Instance (Instanced Text Rendering)
/// Mirrors the CPU-side batching structure field for field (same order and alignment)
struct GlyphInstanceData {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
    uv_min_x: f32,
    uv_min_y: f32,
    uv_max_x: f32,
    uv_max_y: f32,
    r: f32,
    g: f32,
    b: f32,
    a: f32,
};

// ----------------------------------------------------------------------------
// Uniform Buffer (Global Render State)
// ----------------------------------------------------------------------------
struct Uniforms {
    screen_width: f32,
    screen_height: f32,
    scale_factor: f32,
    _padding: f32,
};

// ============================================================================
// Resource Bind Group Layouts
// ============================================================================

// ----------------------------------------------------------------------------
// Group 0: Global Instanced & Batched Primitives (Kept intact to preserve indices)
// ----------------------------------------------------------------------------
@group(0) @binding(0) var<storage, read> circles: array<PointData>;
@group(0) @binding(1) var<storage, read> rects: array<RectData>;
@group(0) @binding(2) var<storage, read> lines: array<LineData>;
// Note: binding(3) is skipped intentionally to match the traditional Vertex Buffer Polygon input
@group(0) @binding(4) var<storage, read> gradient_rects: array<GradientRectData>;
@group(0) @binding(5) var<uniform> uniforms: Uniforms;

// ----------------------------------------------------------------------------
// Group 1: Dedicated High-Throughput Stream (Exclusive for Pure GPU Line Extrusion)
// ----------------------------------------------------------------------------

/// Represents a raw coordinate vertex along the dynamic polyline path
struct PathPointData {
    x: f32,
    y: f32,
};

/// Global rendering aesthetics for the paths (Shifted to Storage-compliant layout)
struct PathStyle {
    r: f32,
    g: f32,
    b: f32,
    a: f32,
    thickness: f32,
    _pad0: f32, // Padding fields to guarantee 16-byte structural boundaries
    _pad1: f32,
    _pad2: f32,
};

/// Per-path draw arguments: where the path's points start in `path_points`
/// and which `path_styles` entry to use
struct PathArgs {
    start_point_idx: u32,
    style_idx: u32,
    _pad0: u32, // Structural padding fields
    _pad1: u32,
};

// Group 1: High-Throughput Stream (Pure GPU Line Extrusion)
@group(1) @binding(0) var<storage, read> path_points: array<PathPointData>;
@group(1) @binding(1) var<storage, read> path_styles: array<PathStyle>;
@group(1) @binding(2) var<storage, read> path_args: array<PathArgs>;

// Group 2: Font Atlas & Glyph Instance Stream
// CPU-side counterparts: `GpuGlyph` instances and the atlas texture/sampler in wgpu.rs
@group(2) @binding(0) var<storage, read> glyph_instances: array<GlyphInstanceData>;
@group(2) @binding(1) var font_atlas_texture: texture_2d<f32>;
@group(2) @binding(2) var font_atlas_sampler: sampler;

// ---------------------------
// Vertex Output Structures
// ---------------------------
struct CircleOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) screen_pos: vec2<f32>,
    @location(1) @interpolate(flat) instance_idx: u32,
};

struct RectOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) screen_pos: vec2<f32>,
    @location(1) @interpolate(flat) instance_idx: u32,
};

struct LineOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) @interpolate(flat) instance_idx: u32,
};

struct PolygonOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) color: vec4<f32>,
};

struct GradientRectOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) uv: vec2<f32>,
    @location(1) @interpolate(flat) instance_idx: u32,
};

/// Output structure passed from the vertex shader through rasterization into the fragment shader
struct PathOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) color: vec4<f32>,
};

// Text Output
struct TextOutput {
    @builtin(position) clip_pos: vec4<f32>,
    @location(0) uv: vec2<f32>,
    @location(1) color: vec4<f32>,
};

// ============================================================================
// Analytical SDF Implementation – ONLY for Circle
// All other shapes (triangle, star, hexagon, etc.) use CPU-generated vertices.
// ============================================================================

/// Signed Distance Field for a perfect circle.
/// p: Local fragment position relative to shape center
/// r: Radius of the circle
fn sd_circle(p: vec2<f32>, r: f32) -> f32 {
    return length(p) - r;
}

// ---------------------------
// 1. Circle Pipeline (draw_circle: Scatter Plot Markers)
// ---------------------------
@vertex
fn circle_vs(@builtin(vertex_index) vi: u32, @builtin(instance_index) ii: u32) -> CircleOutput {
    var quad = vec2<f32>();
    switch vi {
        case 0u: { quad = vec2(-1.0, -1.0); }
        case 1u: { quad = vec2(1.0, -1.0); }
        case 2u: { quad = vec2(-1.0, 1.0); }
        case 3u: { quad = vec2(1.0, 1.0); }
        default: { quad = vec2(0.0); }
    }

    let scale = uniforms.scale_factor;
    let circle = circles[ii];
    // Inflate the quad to 1.5x the radius so the anti-aliased SDF edge is not clipped.
    let final_pos = vec2(circle.x, circle.y) * scale + quad * (circle.radius * 1.5 * scale);
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    let ndc = vec4((final_pos.x/sw)*2.0-1.0, 1.0-(final_pos.y/sh)*2.0, 0.0, 1.0);

    var out: CircleOutput;
    out.clip_pos = ndc;
    out.screen_pos = final_pos;
    out.instance_idx = ii;
    return out;
}

@fragment
fn circle_fs(in: CircleOutput) -> @location(0) vec4<f32> {
    let circle = circles[in.instance_idx];
    let local = in.screen_pos - vec2(circle.x, circle.y) * uniforms.scale_factor;
    let r = circle.radius * uniforms.scale_factor;
    let dist = sd_circle(local, r);
    
    // Smooth anti-aliasing
    let aa = fwidth(dist);
    let alpha = 1.0 - smoothstep(-aa, aa, dist);
    if (alpha <= 0.01) { discard; }
    
    // Swap Red and Blue channels to match the Bgra8Unorm surface format.
    return vec4(circle.b, circle.g, circle.r, circle.a * alpha);
}

// ---------------------------
// 2. Rectangle Pipeline (draw_rect: Pure Filled Bars/Boxes/Backgrounds)
// ---------------------------
@vertex
fn rect_vs(@builtin(vertex_index) vi: u32, @builtin(instance_index) ii: u32) -> RectOutput {
    let r = rects[ii];
    var pos = vec2<f32>();
    
    // Strictly align with the actual rectangle bounds without any inflation for perfect hardware rasterization.
    switch vi {
        case 0u: { pos = vec2(r.x, r.y); }
        case 1u: { pos = vec2(r.x + r.width, r.y); }
        case 2u: { pos = vec2(r.x, r.y + r.height); }
        case 3u: { pos = vec2(r.x + r.width, r.y + r.height); }
        default: { pos = vec2(r.x, r.y); }
    }

    let scale = uniforms.scale_factor;
    let screen_pos = pos * scale;
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    let ndc = vec4((screen_pos.x / sw) * 2.0 - 1.0, 1.0 - (screen_pos.y / sh) * 2.0, 0.0, 1.0);

    var out: RectOutput;
    out.clip_pos = ndc;
    out.screen_pos = screen_pos;
    out.instance_idx = ii;
    return out;
}

@fragment
fn rect_fs(in: RectOutput) -> @location(0) vec4<f32> {
    let r = rects[in.instance_idx];
    
    // Removed all bounds checks and discards to maximize GPU fill-rate performance.
    // Swap Red and Blue channels to match the Bgra8Unorm surface format.
    return vec4(r.b, r.g, r.r, r.a);
}

// ---------------------------
// 3. Line Segment Pipeline (draw_line: Axis/Grid/Ticks)
// ---------------------------
@vertex
fn line_vs(
    @builtin(vertex_index) vi: u32,       // Current vertex index within the primitive (0 to 3 for a quad)
    @builtin(instance_index) ii: u32      // Index of the current line segment in the Storage Buffer
) -> LineOutput {
    // 1. Fetch data and apply High-DPI / Retargeting scaling factor
    let line = lines[ii];
    let scale = uniforms.scale_factor;
    let p1 = vec2(line.x1, line.y1) * scale;
    let p2 = vec2(line.x2, line.y2) * scale;
    
    // 2. Compute direction vector with a safety guard against zero-length segments (prevents NaN)
    var dir = p2 - p1;
    if (length(dir) < 0.0001) {
        dir = vec2<f32>(1.0, 0.0); // Fallback direction to prevent division by zero
    }
    dir = normalize(dir);
    
    // 3. Calculate perpendicular normal vector, scaled by half-width to project outward
    let perp = vec2(-dir.y, dir.x) * (line.width * 0.5 * scale);

    // 4. Extrude vertices dynamically on-chip using TriangleStrip topology
    var pos = vec2<f32>();
    switch vi {
        case 0u: { pos = p1 + perp; } // Start point: left expansion
        case 1u: { pos = p1 - perp; } // Start point: right expansion
        case 2u: { pos = p2 + perp; } // End point: left expansion
        case 3u: { pos = p2 - perp; } // End point: right expansion
        default: { pos = p1; }
    }

    // 5. Convert screen-space pixel coordinates to Normalized Device Coordinates (NDC)
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    // Map X to [-1, 1], and invert Y axis to match WebGPU specifications
    let ndc = vec4((pos.x / sw) * 2.0 - 1.0, 1.0 - (pos.y / sh) * 2.0, 0.0, 1.0);

    // 6. Assemble output payload for the rasterizer
    var out: LineOutput;
    out.clip_pos = ndc;
    out.instance_idx = ii; // Forward instance ID so the Fragment Shader can resolve colors
    return out;
}

@fragment
fn line_fs(in: LineOutput) -> @location(0) vec4<f32> {
    let line = lines[in.instance_idx];
    // Swap Red and Blue channels to match the Bgra8Unorm surface format.
    return vec4(line.b, line.g, line.r, line.a);
}

// ---------------------------
// 4. Polygon Pipeline (draw_polygon: triangle/star/diamond/hexagon etc.)
// Receives CPU-precomputed vertices - NO GPU-side shape generation
// ---------------------------
@vertex
fn polygon_vs(
    @location(0) position: vec2<f32>,
    @location(1) color: vec4<f32>,
    @location(2) is_fill: f32
) -> PolygonOutput {
    let scale = uniforms.scale_factor;
    let screen_pos = position * scale;
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    let ndc = vec4((screen_pos.x/sw)*2.0-1.0, 1.0-(screen_pos.y/sh)*2.0, 0.0, 1.0);

    var out: PolygonOutput;
    out.clip_pos = ndc;
    out.color = color;
    return out;
}

@fragment
fn polygon_fs(in: PolygonOutput) -> @location(0) vec4<f32> {
    return vec4(in.color.b, in.color.g, in.color.r, in.color.a);
}

// ---------------------------
// 5. Gradient Rectangle Pipeline (draw_gradient_rect)
// ---------------------------
@vertex
fn grad_rect_vs(@builtin(vertex_index) vi: u32, @builtin(instance_index) ii: u32) -> GradientRectOutput {
    let r = gradient_rects[ii];
    var quad = vec2<f32>();
    var uv = vec2<f32>();
    switch vi {
        case 0u: { quad = vec2(r.x, r.y); uv = vec2(0.0, 0.0); }
        case 1u: { quad = vec2(r.x + r.width, r.y); uv = vec2(1.0, 0.0); }
        case 2u: { quad = vec2(r.x, r.y + r.height); uv = vec2(0.0, 1.0); }
        case 3u: { quad = vec2(r.x + r.width, r.y + r.height); uv = vec2(1.0, 1.0); }
        default: { quad = vec2(r.x, r.y); }
    }

    let scale = uniforms.scale_factor;
    let screen_pos = quad * scale;
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    let ndc = vec4((screen_pos.x/sw)*2.0-1.0, 1.0-(screen_pos.y/sh)*2.0, 0.0, 1.0);

    var out: GradientRectOutput;
    out.clip_pos = ndc;
    out.uv = uv;
    out.instance_idx = ii;
    return out;
}

@fragment
fn grad_rect_fs(in: GradientRectOutput) -> @location(0) vec4<f32> {
    let r = gradient_rects[in.instance_idx];
    // Interpolate along the rectangle's local X axis only (uv.x runs 0.0 -> 1.0, left to right).
    let mix_val = in.uv.x;
    return vec4(
        mix(r.start_r, r.end_r, mix_val),
        mix(r.start_g, r.end_g, mix_val),
        mix(r.start_b, r.end_b, mix_val),
        mix(r.start_a, r.end_a, mix_val) * r.opacity
    );
}

// ============================================================================
// 6. Pure GPU Polyline Extrusion Pipeline (draw_path - Simple Branch)
// ============================================================================
@vertex
fn path_simple_vs(
    @builtin(vertex_index) vi: u32,
    @builtin(instance_index) ii: u32 // path_idx, forwarded through the instance range of pass.draw
) -> PathOutput {
    // 1. Look up this path's draw arguments, then its style, from the global arrays
    let args = path_args[ii];
    let path_style = path_styles[args.style_idx];

    // Each line segment (quad) is constructed via 6 virtual vertices (2 triangles)
    let segment_idx = vi / 6u;
    let local_vertex_idx = vi % 6u;

    // Fetch p0 and p1 using calculated global offsets from the streaming queues
    let p0_idx = args.start_point_idx + segment_idx;
    let p1_idx = p0_idx + 1u;

    let p0 = path_points[p0_idx];
    let p1 = path_points[p1_idx];

    // Data defense: if any coordinate is NaN, collapse the triangle to eliminate rendering artifacts
    if (p0.x != p0.x || p0.y != p0.y || p1.x != p1.x || p1.y != p1.y) {
        var out: PathOutput;
        out.clip_pos = vec4<f32>(0.0, 0.0, 0.0, 0.0);
        return out;
    }

    // Calculate the direction vector of the current segment
    let delta = vec2<f32>(p1.x - p0.x, p1.y - p0.y);

    // Guard before normalizing: a zero-length segment (two coincident points)
    // would otherwise divide by zero, so fall back to the +X direction.
    var current_dir = vec2<f32>(1.0, 0.0);
    if (length(delta) > 0.0) {
        current_dir = normalize(delta);
    }
    
    // Calculate the right-hand orthogonal normal vector
    let normal = vec2<f32>(-current_dir.y, current_dir.x);

    var raw_pos = vec2<f32>(0.0, 0.0);
    var extrusion_side = 0.0; // 1.0 extends along the normal, -1.0 extends opposite to the normal

    // Map the 6 virtual vertices onto the two triangles (Triangle List) of the segment quad
    switch local_vertex_idx {
        case 0u: { raw_pos = vec2(p0.x, p0.y); extrusion_side = 1.0; }  // p0 Left
        case 1u: { raw_pos = vec2(p0.x, p0.y); extrusion_side = -1.0; } // p0 Right
        case 2u: { raw_pos = vec2(p1.x, p1.y); extrusion_side = 1.0; }  // p1 Left
        
        case 3u: { raw_pos = vec2(p1.x, p1.y); extrusion_side = 1.0; }  // p1 Left
        case 4u: { raw_pos = vec2(p0.x, p0.y); extrusion_side = -1.0; } // p0 Right
        case 5u: { raw_pos = vec2(p1.x, p1.y); extrusion_side = -1.0; } // p1 Right
        default: {}
    }

    // Extrude outward by half of the thickness, in logical (pre-scale) units
    let ext_pos = raw_pos + normal * (path_style.thickness * 0.5 * extrusion_side);

    // Apply the HiDPI scale factor, then normalize against the scaled screen size
    let scale = uniforms.scale_factor;
    let screen_pos = ext_pos * scale;
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;

    // Standard Normalized Device Coordinate (NDC) conversion with Y-axis inversion
    let ndc_x = (screen_pos.x / sw) * 2.0 - 1.0;
    let ndc_y = 1.0 - (screen_pos.y / sh) * 2.0;

    var out: PathOutput;
    out.clip_pos = vec4<f32>(ndc_x, ndc_y, 0.0, 1.0);
    out.color = vec4<f32>(path_style.b, path_style.g, path_style.r, path_style.a);
    return out;
}

@fragment
fn path_simple_fs(in: PathOutput) -> @location(0) vec4<f32> {
    // The Red/Blue swap already happened in path_simple_vs; pass the color through unchanged.
    return in.color;
}

// ---------------------------
// 7. Text Pipeline (draw_text)
// ---------------------------
// Text Instanced Vertex Shader
// Draws each glyph as one instanced quad (6 vertices, two triangles)
@vertex
fn vs_text(
    @builtin(vertex_index) vertex_idx: u32,
    @builtin(instance_index) instance_idx: u32,
) -> TextOutput {
    let glyph = glyph_instances[instance_idx];
    
    // Generate a standard quad using vertex_idx (0 to 5)
    var local_pos: vec2<f32>;
    var uv: vec2<f32>;
    
    switch vertex_idx {
        case 0u: { local_pos = vec2(0.0, 0.0); uv = vec2(glyph.uv_min_x, glyph.uv_min_y); } // Top-Left
        case 1u: { local_pos = vec2(0.0, 1.0); uv = vec2(glyph.uv_min_x, glyph.uv_max_y); } // Bottom-Left
        case 2u: { local_pos = vec2(1.0, 0.0); uv = vec2(glyph.uv_max_x, glyph.uv_min_y); } // Top-Right
        
        case 3u: { local_pos = vec2(1.0, 0.0); uv = vec2(glyph.uv_max_x, glyph.uv_min_y); } // Top-Right
        case 4u: { local_pos = vec2(0.0, 1.0); uv = vec2(glyph.uv_min_x, glyph.uv_max_y); } // Bottom-Left
        case 5u: { local_pos = vec2(1.0, 1.0); uv = vec2(glyph.uv_max_x, glyph.uv_max_y); } // Bottom-Right
        default: {}
    }
    
    // Compute absolute screen position
    let world_pos = vec2<f32>(
        glyph.x + local_pos.x * glyph.width,
        glyph.y + local_pos.y * glyph.height
    );
    
    // Coordinate transformation to Clip Space
    let scale = uniforms.scale_factor;
    let screen_pos = world_pos * scale;
    let sw = uniforms.screen_width * scale;
    let sh = uniforms.screen_height * scale;
    
    var out: TextOutput;
    out.clip_pos = vec4<f32>(
        (screen_pos.x / sw) * 2.0 - 1.0,
        1.0 - (screen_pos.y / sh) * 2.0,
        0.0,
        1.0
    );
    out.uv = uv;
    out.color = vec4<f32>(glyph.r, glyph.g, glyph.b, glyph.a);
    return out;
}

// Text Fragment Shader
// Samples glyph coverage from the font atlas and uses it to scale the instance color's alpha.
@fragment
fn fs_text(in: TextOutput) -> @location(0) vec4<f32> {
    // We sample a single channel texture (A8/R8Unorm) allocated as our font atlas cache
    let tex_color = textureSample(font_atlas_texture, font_atlas_sampler, in.uv);
    
    // Font atlas stores alpha inside the Red component (R8_Unorm format)
    let alpha_mask = tex_color.r;
    
    // Tint with the glyph color; the coverage mask scales alpha for the blend stage
    return vec4<f32>(in.color.rgb, in.color.a * alpha_mask);
}
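
As a CPU-side cross-check of the vertex stages above, the sketch below restates two pieces of the shader arithmetic in Rust: the pixel-to-NDC mapping that every vertex entry point repeats, and the six-vertex quad expansion from `path_simple_vs`. The function names `to_ndc` and `extrude_segment` are hypothetical and are not part of Charton. The math is a transcription of the WGSL listing, not the crate's own code.

```rust
/// Logical-pixel position -> clip-space XY, mirroring the conversion in the
/// vertex shaders. `scale` multiplies both numerator and denominator, so it
/// cancels out of the result. (Hypothetical helper, for illustration only.)
fn to_ndc(pos: [f32; 2], screen: [f32; 2], scale: f32) -> [f32; 2] {
    let sx = pos[0] * scale;
    let sy = pos[1] * scale;
    let sw = screen[0] * scale;
    let sh = screen[1] * scale;
    // X maps to [-1, 1]; Y is flipped because screen Y grows downward.
    [(sx / sw) * 2.0 - 1.0, 1.0 - (sy / sh) * 2.0]
}

/// Expands one segment p0 -> p1 into the six triangle-list vertices that
/// `path_simple_vs` selects with `vertex_index % 6`.
/// (Hypothetical helper, for illustration only.)
fn extrude_segment(p0: [f32; 2], p1: [f32; 2], thickness: f32) -> [[f32; 2]; 6] {
    let (dx, dy) = (p1[0] - p0[0], p1[1] - p0[1]);
    let len = (dx * dx + dy * dy).sqrt();
    // Same fallback direction as the shader when the two points coincide.
    let (ux, uy) = if len > 0.0 { (dx / len, dy / len) } else { (1.0, 0.0) };
    // Perpendicular to the segment direction.
    let (nx, ny) = (-uy, ux);
    let half = thickness * 0.5;
    let offset = |p: [f32; 2], side: f32| [p[0] + nx * half * side, p[1] + ny * half * side];
    [
        offset(p0, 1.0),  // p0 left
        offset(p0, -1.0), // p0 right
        offset(p1, 1.0),  // p1 left
        offset(p1, 1.0),  // p1 left
        offset(p0, -1.0), // p0 right
        offset(p1, -1.0), // p1 right
    ]
}
```

Because `scale` appears in both the numerator and the denominator, `to_ndc` returns the same value for any non-zero scale factor; for example, the center of an 800x600 surface maps to the NDC origin.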

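The anti-aliasing in `circle_fs` can also be reproduced on the CPU to see how coverage falls off across the circle's edge. The following is a minimal sketch, assuming `fwidth(dist)` is about 1.0 (one fragment per pixel); the `aa` parameter stands in for that value, and the function names other than `sd_circle` are hypothetical.

```rust
/// WGSL-style smoothstep: Hermite interpolation clamped to [0, 1].
fn smoothstep(edge0: f32, edge1: f32, x: f32) -> f32 {
    let t = ((x - edge0) / (edge1 - edge0)).clamp(0.0, 1.0);
    t * t * (3.0 - 2.0 * t)
}

/// Signed distance from `p` to a circle of radius `r` centered on the origin:
/// negative inside, zero on the edge, positive outside.
fn sd_circle(p: [f32; 2], r: f32) -> f32 {
    (p[0] * p[0] + p[1] * p[1]).sqrt() - r
}

/// Coverage as computed in `circle_fs`. `aa` is an assumed stand-in for
/// `fwidth(dist)`, which is only available inside a fragment shader.
fn circle_coverage(p: [f32; 2], r: f32, aa: f32) -> f32 {
    1.0 - smoothstep(-aa, aa, sd_circle(p, r))
}
```

With `aa = 1.0`, coverage is 1.0 more than one pixel inside the edge, 0.5 exactly on it, and 0.0 more than one pixel outside, which is why the quad is inflated beyond the nominal radius.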
wgpu.rs
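
The `#[repr(C)]` structs in this file are uploaded as raw bytes, so their sizes have to agree with the element stride the shader expects. One way to guard that is a compile-time size assertion. The sketch below is an illustration under stated assumptions: it copies two structs from the listing with the `bytemuck` derives removed so that it needs only the standard library, and `element_offset` is a hypothetical helper, not part of Charton.

```rust
use std::mem::{align_of, size_of};

// Trimmed copy of `GpuPoint` from wgpu.rs: seven f32 fields.
#[repr(C)]
#[derive(Copy, Clone, Debug, Default)]
pub struct GpuPoint {
    pub x: f32,
    pub y: f32,
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
    pub radius: f32,
}

// Trimmed copy of `GpuPathStyle` from wgpu.rs: five f32 fields plus padding.
#[repr(C)]
#[derive(Copy, Clone, Debug, Default)]
pub struct GpuPathStyle {
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
    pub thickness: f32,
    pub _pad0: f32,
    pub _pad1: f32,
    pub _pad2: f32,
}

// Compile-time guards: the build fails if a field is added or removed
// without updating the expected byte size.
const _: () = assert!(size_of::<GpuPoint>() == 28);
const _: () = assert!(align_of::<GpuPoint>() == 4);
const _: () = assert!(size_of::<GpuPathStyle>() == 32);

/// Hypothetical helper: byte offset of element `index` in a tightly packed
/// array of `T`, as produced by casting a slice of `T` to bytes.
pub fn element_offset<T>(index: usize) -> usize {
    index * size_of::<T>()
}
```

A struct made only of 4-byte scalars has 4-byte alignment, so its array stride equals its size; the same check applied to each struct in the listing would flag any field-count drift between the Rust and WGSL declarations.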

#![allow(unused)]
fn main() {
//! WGPU rendering backend implementation for 2D primitive rendering (circles, lines, rects, polygons, gradients, text)
//! Provides GPU-optimized data structures and render pipelines aligned with WGSL shaders

use crate::core::layer::{
    CircleConfig, GradientRectConfig, LineConfig, PathConfig, PolygonConfig, RectConfig,
    RenderBackend, TextConfig,
};
use crate::visual::color::SingleColor;
use ab_glyph::{Font, FontArc, ScaleFont};
use bytemuck::{Pod, Zeroable};
use std::collections::HashMap;
use wgpu::util::DeviceExt;

// ============================================================================
// GPU Data Structures (field order mirrors the WGSL structs; every member is a 4-byte scalar)
// ============================================================================

/// Base data for instanced SDF (Signed Distance Field) primitives (matches PointData in WGSL)
/// All fields use f32 for consistent GPU memory alignment
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct GpuPoint {
    /// X coordinate of the primitive center (screen space)
    pub x: f32,
    /// Y coordinate of the primitive center (screen space)
    pub y: f32,
    /// Red color channel (0.0 - 1.0)
    pub r: f32,
    /// Green color channel (0.0 - 1.0)
    pub g: f32,
    /// Blue color channel (0.0 - 1.0)
    pub b: f32,
    /// Alpha transparency channel (0.0 - 1.0)
    pub a: f32,
    /// Radius of the SDF primitive (pixels)
    pub radius: f32,
}

/// GPU data structure for line primitives (matches LineData in WGSL)
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct GpuLine {
    /// Start X coordinate (screen space)
    pub x1: f32,
    /// Start Y coordinate (screen space)
    pub y1: f32,
    /// End X coordinate (screen space)
    pub x2: f32,
    /// End Y coordinate (screen space)
    pub y2: f32,
    /// Red color channel (0.0 - 1.0)
    pub r: f32,
    /// Green color channel (0.0 - 1.0)
    pub g: f32,
    /// Blue color channel (0.0 - 1.0)
    pub b: f32,
    /// Alpha transparency channel (0.0 - 1.0)
    pub a: f32,
    /// Line width (pixels)
    pub width: f32,
    /// Manual padding to ensure memory alignment
    pub _pad1: f32,
    pub _pad2: f32,
    pub _pad3: f32,
}

/// GPU data structure for rectangle primitives (matches RectData in WGSL)
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct GpuRect {
    /// Top-left X coordinate (screen space)
    pub x: f32,
    /// Top-left Y coordinate (screen space)
    pub y: f32,
    /// Rectangle width (pixels)
    pub width: f32,
    /// Rectangle height (pixels)
    pub height: f32,
    /// Red color channel (0.0 - 1.0)
    pub r: f32,
    /// Green color channel (0.0 - 1.0)
    pub g: f32,
    /// Blue color channel (0.0 - 1.0)
    pub b: f32,
    /// Alpha transparency channel (0.0 - 1.0)
    pub a: f32,
    /// Corner radius for rounded rectangles (pixels)
    pub corner_radius: f32,
}

/// GPU data structure for gradient-filled rectangles (matches GradientRectData in WGSL)
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct GpuGradientRect {
    /// Top-left X coordinate (screen space)
    pub x: f32,
    /// Top-left Y coordinate (screen space)
    pub y: f32,
    /// Rectangle width (pixels)
    pub width: f32,
    /// Rectangle height (pixels)
    pub height: f32,
    /// Start gradient red channel (0.0 - 1.0)
    start_r: f32,
    /// Start gradient green channel (0.0 - 1.0)
    start_g: f32,
    /// Start gradient blue channel (0.0 - 1.0)
    start_b: f32,
    /// Start gradient alpha channel (0.0 - 1.0)
    start_a: f32,
    /// End gradient red channel (0.0 - 1.0)
    end_r: f32,
    /// End gradient green channel (0.0 - 1.0)
    end_g: f32,
    /// End gradient blue channel (0.0 - 1.0)
    end_b: f32,
    /// End gradient alpha channel (0.0 - 1.0)
    end_a: f32,
    /// Gradient angle (radians) - 0 = horizontal, π/2 = vertical
    pub angle: f32,
    /// Overall opacity multiplier (0.0 - 1.0)
    pub opacity: f32,
}

// ----------------------------------------------------------------------------
// Pure GPU Polyline Extrusion Layouts (Group 1 Spec)
// ----------------------------------------------------------------------------

#[repr(C)]
#[derive(Copy, Clone, Debug, bytemuck::Pod, bytemuck::Zeroable)]
pub struct GpuPathPoint {
    pub x: f32,
    pub y: f32,
}

#[repr(C)]
#[derive(Copy, Clone, Debug, bytemuck::Pod, bytemuck::Zeroable)]
pub struct GpuPathStyle {
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
    pub thickness: f32,
    pub _pad0: f32, // Structural alignment padding (16-byte boundary)
    pub _pad1: f32,
    pub _pad2: f32,
}

/// Meta arguments guiding the Vertex Shader where to look up data
#[repr(C)]
#[derive(Copy, Clone, Debug, bytemuck::Pod, bytemuck::Zeroable)]
pub struct GpuPathArgs {
    pub start_point_idx: u32,
    pub style_idx: u32,
    pub _pad0: u32, // Structural alignment padding
    pub _pad1: u32,
}

/// Individual Glyph Instance Data (Strict 1:1 WGSL Alignment)
/// Matches `GlyphInstanceData` in chart.wgsl
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct GpuGlyph {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
    pub uv_min_x: f32,
    pub uv_min_y: f32,
    pub uv_max_x: f32,
    pub uv_max_y: f32,
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
    // Bearing offsets: distance from pen origin (baseline) to glyph top-left
    pub o_x: f32,
    pub o_y: f32,
    pub _pad0: f32,
    pub _pad1: f32,
}

/// Helper structure to track metadata and texture coordinates
/// for a single rasterized glyph inside the global GPU font atlas.
#[derive(Clone, Debug)]
pub struct CachedGlyphInfo {
    /// Normalized UV coordinates of the top-left corner in the texture atlas [u, v]
    pub uv_min: [f32; 2],
    /// Normalized UV coordinates of the bottom-right corner in the texture atlas [u, v]
    pub uv_max: [f32; 2],
    /// Physical pixel width of the glyph rectangle bounding box
    pub width: f32,
    /// Physical pixel height of the glyph rectangle bounding box
    pub height: f32,
    /// Horizontal bearing offset (distance from pen origin to glyph left edge)
    pub o_x: f32,
    /// Vertical bearing offset (distance from pen origin to glyph top edge)
    pub o_y: f32,
}

/// Vertex data for the polygon pipeline (CPU-generated shapes and custom vector paths)
/// Contains position, color, and fill state for each vertex; matches the `polygon_vs` inputs
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct PathVertex {
    /// Path vertex position (x, y) in screen space
    pub position: [f32; 2],
    /// Path color (rgba, 0.0 - 1.0)
    pub color: [f32; 4],
    /// Fill state flag (1.0 = fill path, 0.0 = stroke only)
    pub is_fill: f32,
}

impl PathVertex {
    /// Vertex buffer layout descriptor for the polygon render pipeline
    /// Matches shader input locations (0 = position, 1 = color, 2 = is_fill)
    pub const DESC: wgpu::VertexBufferLayout<'static> = wgpu::VertexBufferLayout {
        array_stride: std::mem::size_of::<Self>() as wgpu::BufferAddress,
        step_mode: wgpu::VertexStepMode::Vertex,
        attributes: &[
            wgpu::VertexAttribute {
                offset: 0,
                shader_location: 0,
                format: wgpu::VertexFormat::Float32x2,
            },
            wgpu::VertexAttribute {
                offset: std::mem::size_of::<[f32; 2]>() as wgpu::BufferAddress,
                shader_location: 1,
                format: wgpu::VertexFormat::Float32x4,
            },
            wgpu::VertexAttribute {
                offset: (std::mem::size_of::<[f32; 2]>() + std::mem::size_of::<[f32; 4]>())
                    as wgpu::BufferAddress,
                shader_location: 2,
                format: wgpu::VertexFormat::Float32,
            },
        ],
    };
}

/// Global uniform data for all shaders (strict std140 alignment)
/// Contains screen dimensions and scaling factors for coordinate normalization
#[repr(C)]
#[derive(Copy, Clone, Debug, Pod, Zeroable)]
pub struct Uniforms {
    /// Current screen width (pixels)
    pub screen_width: f32,
    /// Current screen height (pixels)
    pub screen_height: f32,
    /// UI scale factor (for high-DPI displays)
    pub scale_factor: f32,
    /// Padding to maintain 16-byte alignment (std140 requirement)
    pub _padding: f32,
}

#[derive(Debug, Clone, Copy)]
/// Represents a single rendering batch command in the interleaved queue.
/// This enum acts as the "Instruction Manual" for the renderer, defining
/// which pipeline to use and which data range to fetch from GPU buffers.
pub enum DrawBatch {
    /// Batch of circles to be rendered via instancing.
    /// 'start' is the offset in the circle instance buffer.
    /// 'count' is the number of circle instances to draw in this call.
    Circle {
        start: u32,
        count: u32,
    },
    Rect {
        start: u32,
        count: u32,
    },
    Line {
        start: u32,
        count: u32,
    },
    Polygon {
        index_start: u32,
        index_count: u32,
    },
    GradientRect {
        start: u32,
        count: u32,
    },
    /// Monolithic GPU indexed simple path batch token
    PathSimple {
        /// Lookup offset index into the global GpuPathArgs array
        path_idx: u32,
        /// Number of raw points contained within this single polyline
        point_count: u32,
    },
    Text {
        start: u32,
        count: u32,
    },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
/// Helper enum to categorize batch types for matching.
pub enum BatchType {
    Circle,
    Rect,
    Line,
    Polygon,
    GradientRect,
    #[allow(dead_code)]
    PathSimple,
    Text,
}

// ============================================================================
// WGPU Backend Core Implementation
// ============================================================================

/// WGPU-based rendering backend for 2D primitive rendering
/// Manages GPU resources (pipelines, buffers, bind groups) and handles rendering commands
pub struct WgpuBackend {
    // Core WGPU device and queue
    device: wgpu::Device,
    queue: wgpu::Queue,

    // Global uniform buffer (screen dimensions, scale factor)
    uniform_buffer: wgpu::Buffer,

    // Main bind group (matches @group(0) in chart.wgsl)
    main_bind_group: wgpu::BindGroup,
    main_bind_group_layout: wgpu::BindGroupLayout,

    // Circle primitive resources
    circle_pipeline: wgpu::RenderPipeline,
    circle_buffer: wgpu::Buffer,
    pending_circles: Vec<GpuPoint>,
    uploaded_circle_count: u32,

    // Rectangle primitive resources
    rect_pipeline: wgpu::RenderPipeline,
    rect_buffer: wgpu::Buffer,
    pending_rects: Vec<GpuRect>,
    uploaded_rect_count: u32,

    // Line primitive resources
    line_pipeline: wgpu::RenderPipeline,
    line_buffer: wgpu::Buffer,
    pending_lines: Vec<GpuLine>,
    uploaded_line_count: u32,

    // Polygon primitive resources
    polygon_pipeline: wgpu::RenderPipeline,
    polygon_vertex_buffer: wgpu::Buffer,
    polygon_index_buffer: wgpu::Buffer,
    pending_polygon_vertices: Vec<PathVertex>,
    pending_polygon_indices: Vec<u16>,
    uploaded_polygon_index_count: u32,

    // Gradient rectangle resources
    gradient_rect_pipeline: wgpu::RenderPipeline,
    gradient_rect_buffer: wgpu::Buffer,
    pending_gradient_rects: Vec<GpuGradientRect>,
    uploaded_gradient_rect_count: u32,

    // Path primitive resources
    path_simple_pipeline: wgpu::RenderPipeline,
    path_bind_group_layout: wgpu::BindGroupLayout,
    pending_path_points: Vec<GpuPathPoint>,
    pending_path_styles: Vec<GpuPathStyle>,
    pending_path_args: Vec<GpuPathArgs>,

    // Specialized render pipeline for text instancing quads
    text_pipeline: wgpu::RenderPipeline,
    text_instance_buffer: wgpu::Buffer,
    pending_glyphs: Vec<GpuGlyph>,
    text_atlas_texture: wgpu::Texture,
    text_atlas_view: wgpu::TextureView,
    text_atlas_sampler: wgpu::Sampler,
    text_bind_group_layout: wgpu::BindGroupLayout,

    /// CPU font parser and glyph metric computer via ab_glyph
    font: FontArc,
    /// Cache of rasterized glyphs keyed by `(char, u32)`, so each glyph is rasterized only once
    font_cache: HashMap<(char, u32), CachedGlyphInfo>,
    /// Incremental texture coordinate trackers for custom packing layout
    atlas_current_x: u32,
    atlas_current_y: u32,
    atlas_max_row_height: u32,

    /// Interleaved batch queue to preserve rendering order
    batches: Vec<DrawBatch>,

    /// Running instance count for rendering primitives (used as buffer offset)
    current_circle_count: u32,
    current_rect_count: u32,
    current_line_count: u32,
    current_polygon_index_count: u32,
    current_grad_rect_count: u32,
    /// Total number of individual characters processed in the current frame
    current_glyph_count: u32,
    /// Device pixel ratio / HiDPI scale factor passed from caller
    scale_factor: f32,
}

impl WgpuBackend {
    /// Creates a new WGPU rendering backend with multi-group high-throughput pipelines.
    ///
    /// # Resource Binding Architecture:
    /// - `@group(0)`: Global Batched Primitives (Circles, Rectangles, Standard Lines, Gradient Rectangles, Uniforms)
    /// - `@group(1)`: Dedicated High-Throughput Stream (Pure GPU Path Extrusion via Raw Coordinates)
    /// - `@group(2)`: Dedicated Text Stream (Glyph Instances, Font Atlas Texture, Sampler)
    pub async fn new(
        device: wgpu::Device,
        queue: wgpu::Queue,
        screen_width: u32,
        screen_height: u32,
        scale_factor: f32,
    ) -> Self {
        // 1. Load WGSL shader module (chart.wgsl contains all primitive shaders)
        let shader = device.create_shader_module(wgpu::include_wgsl!("chart.wgsl"));

        // 2. Initialize global uniform buffer with screen dimensions and scaling factor
        let uniforms = Uniforms {
            screen_width: screen_width as f32,
            screen_height: screen_height as f32,
            scale_factor,
            _padding: 0.0,
        };
        let uniform_buffer = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
            label: Some("Uniform Buffer - Group 0 Binding 5"),
            contents: bytemuck::cast_slice(&[uniforms]),
            usage: wgpu::BufferUsages::UNIFORM | wgpu::BufferUsages::COPY_DST,
        });

        // ====================================================================
        // RESOURCE BIND GROUP LAYOUT DECLARATIONS (Hardware Contracts)
        // ====================================================================

        // Create main bind group layout (@group(0) in chart.wgsl)
        let main_bind_group_layout =
            device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
                label: Some("Main Primitive Bind Group Layout 0"),
                entries: &[
                    wgpu::BindGroupLayoutEntry {
                        binding: 0,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 1,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 2,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 4,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 5,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Uniform,
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                ],
            });

        // Create high-throughput path stream bind group layout (@group(1) in chart.wgsl)
        let path_bind_group_layout =
            device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
                label: Some("Path Global Storage Bind Group Layout 1"),
                entries: &[
                    wgpu::BindGroupLayoutEntry {
                        binding: 0,
                        visibility: wgpu::ShaderStages::VERTEX,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 1,
                        visibility: wgpu::ShaderStages::VERTEX,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 2,
                        visibility: wgpu::ShaderStages::VERTEX,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                ],
            });

        // Create the text pipeline first so that its actual bind group layout is
        // available before the remaining pipeline layouts are assembled.
        let (
            text_bind_group_layout,
            text_pipeline,
            text_atlas_texture,
            text_atlas_view,
            text_atlas_sampler,
        ) = Self::create_text_pipeline(&device, &main_bind_group_layout);

        // ====================================================================
        // GLOBAL PIPELINE LAYOUT ARCHITECTURE (Multi-Group Mapping)
        // ====================================================================
        let path_simple_pipeline_layout =
            device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
                label: Some("Path Simple Pipeline Layout"),
                bind_group_layouts: &[Some(&main_bind_group_layout), Some(&path_bind_group_layout)],
                immediate_size: 0,
            });

        // ====================================================================
        // RENDER PIPELINES COMPILATION
        // ====================================================================
        let circle_pipeline =
            Self::create_circle_pipeline(&device, &shader, &main_bind_group_layout);
        let (_line_bg_layout, line_pipeline) =
            Self::create_line_pipeline(&device, &shader, &main_bind_group_layout);
        let rect_pipeline = Self::create_rect_pipeline(&device, &shader, &main_bind_group_layout);
        let polygon_pipeline =
            Self::create_polygon_pipeline(&device, &shader, &main_bind_group_layout);
        let (_grad_bg_layout, gradient_rect_pipeline) =
            Self::create_gradient_rect_pipeline(&device, &shader, &main_bind_group_layout);

        // Build the simple path pipeline: line extrusion with all vertices generated
        // in the vertex shader (no vertex buffers are bound).
        let texture_format = wgpu::TextureFormat::Rgba8Unorm;
        let path_simple_pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Path Simple Hardware Extrusion Pipeline"),
            layout: Some(&path_simple_pipeline_layout),
            vertex: wgpu::VertexState {
                module: &shader,
                entry_point: Some("path_simple_vs"),
                buffers: &[],
                compilation_options: Default::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: &shader,
                entry_point: Some("path_simple_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: texture_format,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: Default::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleList,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        });

        // ====================================================================
        // INITIAL GPU STORAGE BUFFER ALLOCATIONS
        // ====================================================================
        let dummy_circles = Self::create_dummy_buffer::<GpuPoint>(&device);
        let dummy_rects = Self::create_dummy_buffer::<GpuRect>(&device);
        let dummy_lines = Self::create_dummy_buffer::<GpuLine>(&device);
        let dummy_grad_rects = Self::create_dummy_buffer::<GpuGradientRect>(&device);
        let dummy_polygon_vertices = Self::create_dummy_buffer::<PathVertex>(&device);
        let dummy_polygon_indices = Self::create_dummy_buffer::<u16>(&device);

        // Bind main resources
        let main_bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
            label: Some("Main Primitive Bind Group 0"),
            layout: &main_bind_group_layout,
            entries: &[
                wgpu::BindGroupEntry {
                    binding: 0,
                    resource: wgpu::BindingResource::Buffer(dummy_circles.as_entire_buffer_binding()),
                },
                wgpu::BindGroupEntry {
                    binding: 1,
                    resource: wgpu::BindingResource::Buffer(dummy_rects.as_entire_buffer_binding()),
                },
                wgpu::BindGroupEntry {
                    binding: 2,
                    resource: wgpu::BindingResource::Buffer(dummy_lines.as_entire_buffer_binding()),
                },
                wgpu::BindGroupEntry {
                    binding: 4,
                    resource: wgpu::BindingResource::Buffer(dummy_grad_rects.as_entire_buffer_binding()),
                },
                wgpu::BindGroupEntry {
                    binding: 5,
                    resource: wgpu::BindingResource::Buffer(uniform_buffer.as_entire_buffer_binding()),
                },
            ],
        });

        // Pre-allocate GPU memory for text instances (up to 4096 characters per frame)
        let text_instance_buffer = device.create_buffer(&wgpu::BufferDescriptor {
            label: Some("Text Instance Storage Buffer"),
            size: (std::mem::size_of::<GpuGlyph>() * 4096) as wgpu::BufferAddress,
            usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST,
            mapped_at_creation: false,
        });

        // Load fallback scalable TTF font data from the embedded binary payload
        let font_bytes = include_bytes!("../../../assets/fonts/Inter-Regular.ttf");
        let font = ab_glyph::FontArc::try_from_slice(font_bytes)
            .expect("CRITICAL: Failed to parse embedded fallback TTF font data");

        // Start with an empty glyph cache, used at runtime to deduplicate glyphs
        let font_cache = std::collections::HashMap::new();

        // ====================================================================
        // BACKEND STRUCTURE ASSEMBLY
        // ====================================================================
        Self {
            device,
            queue,
            uniform_buffer,
            main_bind_group,
            main_bind_group_layout,

            circle_pipeline,
            circle_buffer: dummy_circles,
            pending_circles: Vec::with_capacity(30_000),
            uploaded_circle_count: 0,

            rect_pipeline,
            rect_buffer: dummy_rects,
            pending_rects: Vec::with_capacity(10_000),
            uploaded_rect_count: 0,

            polygon_pipeline,
            polygon_vertex_buffer: dummy_polygon_vertices,
            polygon_index_buffer: dummy_polygon_indices,
            pending_polygon_vertices: Vec::with_capacity(50_000),
            pending_polygon_indices: Vec::with_capacity(100_000),
            uploaded_polygon_index_count: 0,

            line_pipeline,
            line_buffer: dummy_lines,
            pending_lines: Vec::with_capacity(10_000),
            uploaded_line_count: 0,

            gradient_rect_pipeline,
            gradient_rect_buffer: dummy_grad_rects,
            pending_gradient_rects: Vec::with_capacity(10_000),
            uploaded_gradient_rect_count: 0,

            path_simple_pipeline,
            path_bind_group_layout,
            pending_path_points: Vec::with_capacity(100_000),
            pending_path_styles: Vec::with_capacity(1024),
            pending_path_args: Vec::with_capacity(1024),

            text_pipeline,
            text_instance_buffer,
            pending_glyphs: Vec::with_capacity(512),
            text_atlas_texture,
            text_atlas_view,
            text_atlas_sampler,
            text_bind_group_layout,

            font,
            font_cache,
            atlas_current_x: 4,
            atlas_current_y: 4,
            atlas_max_row_height: 0,

            batches: Vec::with_capacity(1024),
            current_circle_count: 0,
            current_rect_count: 0,
            current_line_count: 0,
            current_polygon_index_count: 0,
            current_grad_rect_count: 0,
            current_glyph_count: 0,
            scale_factor,
        }
    }

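    /// Records `count` newly queued primitives of `batch_type`, extending the
    /// previous batch when it has the same type.
    ///
    /// New batches compute their start offset as `current_*_count - count`, so the
    /// matching counter must already include these primitives when this is called.
    fn push_batch(&mut self, batch_type: BatchType, count: u32) {
        // Attempt to merge with the last batch if the type is the same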
        let merged = match (self.batches.last_mut(), batch_type) {
            (Some(DrawBatch::Circle { count: c, .. }), BatchType::Circle) => { *c += count; true }
            (Some(DrawBatch::Rect { count: c, .. }), BatchType::Rect) => { *c += count; true }
            (Some(DrawBatch::Line { count: c, .. }), BatchType::Line) => { *c += count; true }
            (Some(DrawBatch::Polygon { index_count: c, .. }), BatchType::Polygon) => { *c += count; true }
            (Some(DrawBatch::GradientRect { count: c, .. }), BatchType::GradientRect) => { *c += count; true }
            (Some(DrawBatch::Text { count: c, .. }), BatchType::Text) => { *c += count; true }
            // Path batches are never merged: each one refers to its own `path_idx`.
            (_, BatchType::PathSimple) => false,
            _ => false,
        };

        if !merged {
            let new_batch = match batch_type {
                BatchType::Circle => DrawBatch::Circle { start: self.current_circle_count.saturating_sub(count), count },
                BatchType::Rect => DrawBatch::Rect { start: self.current_rect_count.saturating_sub(count), count },
                BatchType::Line => DrawBatch::Line { start: self.current_line_count.saturating_sub(count), count },
                BatchType::Polygon => DrawBatch::Polygon { index_start: self.current_polygon_index_count.saturating_sub(count), index_count: count },
                BatchType::GradientRect => DrawBatch::GradientRect { start: self.current_grad_rect_count.saturating_sub(count), count },
                BatchType::PathSimple => DrawBatch::PathSimple {
                    path_idx: (self.pending_path_args.len() as u32).saturating_sub(1),
                    point_count: count,
                },
                // Start a new text batch; contiguous glyph batches are merged above
                // to minimize draw calls.
                BatchType::Text => DrawBatch::Text { start: self.current_glyph_count.saturating_sub(count), count },
            };
            self.batches.push(new_batch);
        }
    }

    /// Clears all state, batches, and buffers to prepare for the next frame.
    /// This should be called at the end of each frame's rendering cycle.
    pub fn reset(&mut self) {
        // 1. Reset the batch queue for the new frame
        self.batches.clear();

        // 2. Reset counters used for generating batch start offsets
        self.current_circle_count = 0;
        self.current_rect_count = 0;
        self.current_line_count = 0;
        self.current_polygon_index_count = 0;
        self.current_grad_rect_count = 0;
        self.current_glyph_count = 0;

        // 3. Clear all pending CPU-side buffers that accumulate data via draw_* calls
        self.pending_circles.clear();
        self.pending_rects.clear();
        self.pending_lines.clear();
        self.pending_polygon_vertices.clear();
        self.pending_polygon_indices.clear();
        // Gradient rects must be cleared too, since their counters are reset above and below
        self.pending_gradient_rects.clear();
        self.pending_glyphs.clear();

        // 4. Reset uploaded counters
        // This is crucial: it marks that no data has been uploaded to the GPU
        // for the new frame yet.
        self.uploaded_circle_count = 0;
        self.uploaded_rect_count = 0;
        self.uploaded_line_count = 0;
        self.uploaded_polygon_index_count = 0;
        self.uploaded_gradient_rect_count = 0;

        // 5. Clear the GPU path streaming queues
        self.pending_path_points.clear();
        self.pending_path_styles.clear();
        self.pending_path_args.clear();
    }

    // ============================================================================
    // Pipeline Creation Helpers
    // ============================================================================

    /// Creates the circle render pipeline (uses an SDF shader for smooth anti-aliasing)
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    /// * `shader` - Compiled WGSL shader module
    /// * `main_layout` - Main bind group layout (@group(0))
    ///
    /// # Returns
    /// Circle render pipeline
    fn create_circle_pipeline(
        device: &wgpu::Device,
        shader: &wgpu::ShaderModule,
        main_layout: &wgpu::BindGroupLayout,
    ) -> wgpu::RenderPipeline {
        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Circle Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout)],
            immediate_size: 0,
        });

        device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Circle Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: shader,
                entry_point: Some("circle_vs"),
                buffers: &[],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: shader,
                entry_point: Some("circle_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleStrip,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        })
    }

    /// Creates the line render pipeline and bind group layout
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    /// * `shader` - Compiled WGSL shader module
    /// * `main_layout` - Main bind group layout (@group(0))
    ///
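    /// # Returns
    /// Tuple of (bind group layout, render pipeline). The pipeline layout itself
    /// uses only `main_layout`; the returned bind group layout is not attached to it.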
    fn create_line_pipeline(
        device: &wgpu::Device,
        shader: &wgpu::ShaderModule,
        main_layout: &wgpu::BindGroupLayout,
    ) -> (wgpu::BindGroupLayout, wgpu::RenderPipeline) {
        let line_bind_group_layout =
            device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
                label: Some("Line Bind Group Layout"),
                entries: &[
                    wgpu::BindGroupLayoutEntry {
                        binding: 0,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 1,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Uniform,
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                ],
            });

        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Line Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout)],
            immediate_size: 0,
        });

        let pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Line Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: shader,
                entry_point: Some("line_vs"),
                buffers: &[],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: shader,
                entry_point: Some("line_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleStrip,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        });

        (line_bind_group_layout, pipeline)
    }

    /// Creates the rectangle render pipeline
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    /// * `shader` - Compiled WGSL shader module
    /// * `main_layout` - Main bind group layout (@group(0))
    ///
    /// # Returns
    /// Rectangle render pipeline
    fn create_rect_pipeline(
        device: &wgpu::Device,
        shader: &wgpu::ShaderModule,
        main_layout: &wgpu::BindGroupLayout,
    ) -> wgpu::RenderPipeline {
        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Rect Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout)],
            immediate_size: 0,
        });

        device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Rect Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: shader,
                entry_point: Some("rect_vs"),
                buffers: &[],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: shader,
                entry_point: Some("rect_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleStrip,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        })
    }

    /// Creates the polygon render pipeline (receives CPU-precomputed vertices)
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    /// * `shader` - Compiled WGSL shader module
    /// * `main_layout` - Main bind group layout (@group(0))
    ///
    /// # Returns
    /// Polygon render pipeline
    fn create_polygon_pipeline(
        device: &wgpu::Device,
        shader: &wgpu::ShaderModule,
        main_layout: &wgpu::BindGroupLayout,
    ) -> wgpu::RenderPipeline {
        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Polygon Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout)],
            immediate_size: 0,
        });

        device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Polygon Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: shader,
                entry_point: Some("polygon_vs"),
                buffers: &[PathVertex::DESC],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: shader,
                entry_point: Some("polygon_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleList,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        })
    }

    /// Creates the gradient rectangle render pipeline and bind group layout
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    /// * `shader` - Compiled WGSL shader module
    /// * `main_layout` - Main bind group layout (@group(0))
    ///
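    /// # Returns
    /// Tuple of (bind group layout, render pipeline). The pipeline layout itself
    /// uses only `main_layout`; the returned bind group layout is not attached to it.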
    fn create_gradient_rect_pipeline(
        device: &wgpu::Device,
        shader: &wgpu::ShaderModule,
        main_layout: &wgpu::BindGroupLayout,
    ) -> (wgpu::BindGroupLayout, wgpu::RenderPipeline) {
        let gradient_rect_bind_group_layout =
            device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
                label: Some("Gradient Rect Bind Group Layout"),
                entries: &[
                    wgpu::BindGroupLayoutEntry {
                        binding: 0,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Storage { read_only: true },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                    wgpu::BindGroupLayoutEntry {
                        binding: 1,
                        visibility: wgpu::ShaderStages::VERTEX | wgpu::ShaderStages::FRAGMENT,
                        ty: wgpu::BindingType::Buffer {
                            ty: wgpu::BufferBindingType::Uniform,
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    },
                ],
            });

        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Gradient Rect Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout)],
            immediate_size: 0,
        });

        let pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Gradient Rect Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: shader,
                entry_point: Some("grad_rect_vs"),
                buffers: &[],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: shader,
                entry_point: Some("grad_rect_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm,
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING),
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: wgpu::PipelineCompilationOptions::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleStrip,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: None,
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        });

        (gradient_rect_bind_group_layout, pipeline)
    }

    // ============================================================================
    // Text Pipeline & Resource Initialization
    // ============================================================================
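    /// Creates everything the text system needs in one step: the render pipeline,
    /// bind group layout, texture atlas, and sampler.
    /// This function is no longer `async`, so it fits the standard synchronous
    /// initialization flow.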
    pub fn create_text_pipeline(
        device: &wgpu::Device,
        main_layout: &wgpu::BindGroupLayout,
    ) -> (
        wgpu::BindGroupLayout, // text_bind_group_layout
        wgpu::RenderPipeline,  // text_pipeline
        wgpu::Texture,         // text_atlas_texture
        wgpu::TextureView,     // text_atlas_view
        wgpu::Sampler,         // text_atlas_sampler
    ) {
        // --------------------------------------------------------------------
        // A. Define the text-local bind group layout (@group(1))
        // --------------------------------------------------------------------
        let text_bind_group_layout = device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
            label: Some("Text Local Bind Group Layout (@group(1))"),
            entries: &[
                // Binding 0: dynamic glyph instance storage buffer
                wgpu::BindGroupLayoutEntry {
                    binding: 0,
                    visibility: wgpu::ShaderStages::VERTEX,
                    ty: wgpu::BindingType::Buffer {
                        ty: wgpu::BufferBindingType::Storage { read_only: true },
                        has_dynamic_offset: false,
                        min_binding_size: None,
                    },
                    count: None,
                },
                // Binding 1: font atlas texture (single-channel alpha mask)
                wgpu::BindGroupLayoutEntry {
                    binding: 1,
                    visibility: wgpu::ShaderStages::FRAGMENT,
                    ty: wgpu::BindingType::Texture {
                        multisampled: false,
                        view_dimension: wgpu::TextureViewDimension::D2,
                        sample_type: wgpu::TextureSampleType::Float { filterable: true },
                    },
                    count: None,
                },
                // Binding 2: linear texture sampler
                wgpu::BindGroupLayoutEntry {
                    binding: 2,
                    visibility: wgpu::ShaderStages::FRAGMENT,
                    ty: wgpu::BindingType::Sampler(wgpu::SamplerBindingType::Filtering),
                    count: None,
                },
            ],
        });

        // --------------------------------------------------------------------
        // B. Inline high-throughput WGSL text shader
        // --------------------------------------------------------------------
        let text_wgsl = r#"
            struct EngineGlobalData {
                screen_width: f32,
                screen_height: f32,
                scale_factor: f32,
                _padding: f32,
            };

            struct GpuGlyph {
                x: f32,
                y: f32,
                width: f32,
                height: f32,
                uv_min_x: f32,
                uv_min_y: f32,
                uv_max_x: f32,
                uv_max_y: f32,
                r: f32,
                g: f32,
                b: f32,
                a: f32,
                o_x: f32,
                o_y: f32,
                _pad0: f32,
                _pad1: f32,
            };

            struct TextInstances {
                glyphs: array<GpuGlyph>,
            };

            @group(0) @binding(5) var<uniform> global_data: EngineGlobalData;
            
            @group(1) @binding(0) var<storage, read> instance_data: TextInstances;
            @group(1) @binding(1) var t_atlas: texture_2d<f32>;
            @group(1) @binding(2) var s_atlas: sampler;

            struct VertexOutput {
                @builtin(position) clip_position: vec4<f32>,
                @location(0) uv: vec2<f32>,
                @location(1) color: vec4<f32>,
            };

            @vertex
            fn text_vs(
                @builtin(vertex_index) v_idx: u32,
                @builtin(instance_index) i_idx: u32,
            ) -> VertexOutput {
                let glyph = instance_data.glyphs[i_idx];
                var local_pos = vec2<f32>(0.0, 0.0);
                var uv = vec2<f32>(0.0, 0.0);

                // Interpret glyph.x,y as baseline pen origin. Convert to top-left by adding bearing offsets.
                let top_left = vec2<f32>(glyph.x + glyph.o_x, glyph.y + glyph.o_y);
                let corners = array<vec2<f32>, 6>(
                    vec2<f32>(0.0, 0.0),
                    vec2<f32>(0.0, 1.0),
                    vec2<f32>(1.0, 0.0),
                    vec2<f32>(0.0, 0.0),
                    vec2<f32>(1.0, 0.0),
                    vec2<f32>(1.0, 1.0),
                );
                let uvs = array<vec2<f32>, 6>(
                    vec2<f32>(glyph.uv_min_x, glyph.uv_min_y),
                    vec2<f32>(glyph.uv_min_x, glyph.uv_max_y),
                    vec2<f32>(glyph.uv_max_x, glyph.uv_min_y),
                    vec2<f32>(glyph.uv_min_x, glyph.uv_min_y),
                    vec2<f32>(glyph.uv_max_x, glyph.uv_min_y),
                    vec2<f32>(glyph.uv_max_x, glyph.uv_max_y),
                );
                local_pos = top_left + vec2<f32>(corners[v_idx].x * glyph.width, corners[v_idx].y * glyph.height);
                uv = uvs[v_idx];

                // Convert logical local_pos into physical pixels using scale_factor
                let final_pos = local_pos * global_data.scale_factor;
                let sw = global_data.screen_width * global_data.scale_factor;
                let sh = global_data.screen_height * global_data.scale_factor;
                let ndc_x = (final_pos.x / sw) * 2.0 - 1.0;
                let ndc_y = 1.0 - (final_pos.y / sh) * 2.0;

                var out: VertexOutput;
                out.clip_position = vec4<f32>(ndc_x, ndc_y, 0.0, 1.0);
                out.uv = uv;
                out.color = vec4<f32>(glyph.r, glyph.g, glyph.b, glyph.a);
                return out;
            }

            @fragment
            fn text_fs(in: VertexOutput) -> @location(0) vec4<f32> {
                let alpha_mask = textureSample(t_atlas, s_atlas, in.uv).r;
                return vec4<f32>(in.color.rgb, in.color.a * alpha_mask);
            }
        "#;

        let shader_module = device.create_shader_module(wgpu::ShaderModuleDescriptor {
            label: Some("Charton Text WGSL Module"),
            source: wgpu::ShaderSource::Wgsl(text_wgsl.into()),
        });

        // --------------------------------------------------------------------
        // C. Assemble the pipeline layout (group 0: main, group 1: text)
        // --------------------------------------------------------------------
        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("Text Pipeline Layout"),
            bind_group_layouts: &[Some(main_layout), Some(&text_bind_group_layout)],
            immediate_size: 0,
        });

        // --------------------------------------------------------------------
        // D. Build the render pipeline
        // --------------------------------------------------------------------
        let text_pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Text Render Pipeline"),
            layout: Some(&pipeline_layout),
            vertex: wgpu::VertexState {
                module: &shader_module,
                entry_point: Some("text_vs"),
                buffers: &[], // No vertex buffers: glyph instances are read from the storage buffer
                compilation_options: Default::default(),
            },
            fragment: Some(wgpu::FragmentState {
                module: &shader_module,
                entry_point: Some("text_fs"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: wgpu::TextureFormat::Rgba8Unorm, // Adjust to match your actual surface format
                    blend: Some(wgpu::BlendState::ALPHA_BLENDING), // Enable alpha blending for text
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: Default::default(),
            }),
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleList,
                ..Default::default()
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState::default(),
            multiview_mask: None,
            cache: None,
        });

        // --------------------------------------------------------------------
        // E. Allocate the 2048x2048 dynamic glyph atlas resources
        // --------------------------------------------------------------------
        let atlas_width = 2048;
        let atlas_height = 2048;

        let text_atlas_texture = device.create_texture(&wgpu::TextureDescriptor {
            label: Some("Charton Monolithic Font Atlas Texture"),
            size: wgpu::Extent3d {
                width: atlas_width,
                height: atlas_height,
                depth_or_array_layers: 1,
            },
            mip_level_count: 1,
            sample_count: 1,
            dimension: wgpu::TextureDimension::D2,
            format: wgpu::TextureFormat::R8Unorm,
            usage: wgpu::TextureUsages::TEXTURE_BINDING | wgpu::TextureUsages::COPY_DST,
            view_formats: &[],
        });

        // Fully specified view descriptor for the atlas texture
        let text_atlas_view = text_atlas_texture.create_view(&wgpu::TextureViewDescriptor {
            label: Some("Text Font Atlas Texture View"),
            format: Some(wgpu::TextureFormat::R8Unorm),
            dimension: Some(wgpu::TextureViewDimension::D2),
            usage: Default::default(), // Use Default so the correct value for this field's type is inferred from context
            aspect: wgpu::TextureAspect::All,
            base_mip_level: 0,
            mip_level_count: None,
            base_array_layer: 0,
            array_layer_count: None,
        });

        // Atlas sampler (nearest filtering; the mipmap filter uses MipmapFilterMode)
        let text_atlas_sampler = device.create_sampler(&wgpu::SamplerDescriptor {
            label: Some("Text Font Atlas Nearest Sampler"),
            address_mode_u: wgpu::AddressMode::ClampToEdge,
            address_mode_v: wgpu::AddressMode::ClampToEdge,
            address_mode_w: wgpu::AddressMode::ClampToEdge,
            mag_filter: wgpu::FilterMode::Nearest,
            min_filter: wgpu::FilterMode::Nearest,
            mipmap_filter: wgpu::MipmapFilterMode::Nearest,
            ..Default::default()
        });

        // Return the five resources the caller needs as a tuple
        (
            text_bind_group_layout,
            text_pipeline,
            text_atlas_texture,
            text_atlas_view,
            text_atlas_sampler,
        )
    }

    // ============================================================================
    // Buffer Creation Helpers
    // ============================================================================

    /// Creates a dummy buffer with zero-initialized data (for initial bind group setup)
    ///
    /// # Arguments
    /// * `device` - WGPU device handle
    ///
    /// # Returns
    /// Zero-initialized buffer of type T
    fn create_dummy_buffer<T: Pod>(device: &wgpu::Device) -> wgpu::Buffer {
        device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
            label: Some(format!("Dummy {} Buffer", std::any::type_name::<T>()).as_str()),
            contents: bytemuck::cast_slice(&[T::zeroed()]),
            usage: wgpu::BufferUsages::STORAGE
                | wgpu::BufferUsages::COPY_DST
                | wgpu::BufferUsages::VERTEX
                | wgpu::BufferUsages::INDEX,
        })
    }

    /// Creates a GPU buffer from a slice of POD (Plain Old Data) values
    ///
    /// # Arguments
    /// * `data` - Slice of POD values to copy to GPU
    ///
    /// # Returns
    /// WGPU buffer containing the copied data
    fn create_buffer<T: Pod>(&self, data: &[T]) -> wgpu::Buffer {
        self.device
            .create_buffer_init(&wgpu::util::BufferInitDescriptor {
                label: Some(format!("Updated {} Buffer", std::any::type_name::<T>()).as_str()),
                contents: bytemuck::cast_slice(data),
                usage: wgpu::BufferUsages::STORAGE
                    | wgpu::BufferUsages::COPY_DST
                    | wgpu::BufferUsages::VERTEX
                    | wgpu::BufferUsages::INDEX,
            })
    }

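    /// Queues a polyline for GPU-side line expansion.
    ///
    /// Despite the name, no CPU tessellation happens here (this replaces the earlier
    /// Lyon-based tessellator). The points, style, and lookup offsets are appended to
    /// the pending storage-buffer arrays, and the vertex shader expands each segment
    /// into 6 vertices without using a vertex buffer.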
    pub fn tessellate_path(&mut self, config: PathConfig) {
        // A valid polyline segment requires at least a starting point and a destination
        if config.points.len() < 2 {
            return;
        }

        // Record lookup offsets inside the global contiguous arrays
        let start_point_idx = self.pending_path_points.len() as u32;
        let point_count = config.points.len() as u32;
        let style_idx = self.pending_path_styles.len() as u32;
        let path_idx = self.pending_path_args.len() as u32;

        // 1. Stream raw (x, y) coordinates into the global points pool
        for &(x, y) in &config.points {
            self.pending_path_points.push(GpuPathPoint { x, y });
        }

        // 2. Format and push the styling configurations with opacity multiplier
        let stroke_color = config.stroke.rgba();
        self.pending_path_styles.push(GpuPathStyle {
            r: stroke_color[0],
            g: stroke_color[1],
            b: stroke_color[2],
            a: stroke_color[3] * config.opacity,
            thickness: config.stroke_width,
            _pad0: 0.0, // Satisfy 16-byte layout alignment boundaries explicitly
            _pad1: 0.0,
            _pad2: 0.0,
        });

        // 3. Compress layout parameters into the global routing lookup map
        self.pending_path_args.push(GpuPathArgs {
            start_point_idx,
            style_idx,
            _pad0: 0, // Satisfy structural padding constraints
            _pad1: 0,
        });

        // 4. Commit a zero-alignment-overhead batch token into the deferred render queue
        self.batches.push(DrawBatch::PathSimple {
            path_idx,
            point_count,
        });
    }

    // ============================================================================
    // Render & Flush
    // ============================================================================
    /// Flushes pending render data to GPU buffers and renders to the target texture view
    pub fn flush_and_render(&mut self, view: &wgpu::TextureView) {
        // --------------------------------------------------------------------
        // PHASE 1: DATA UPLOAD
        // --------------------------------------------------------------------
        if !self.pending_circles.is_empty() {
            let circles = std::mem::take(&mut self.pending_circles);
            self.circle_buffer = self.create_buffer(&circles);
            self.uploaded_circle_count = circles.len() as u32;
        }

        if !self.pending_rects.is_empty() {
            let rects = std::mem::take(&mut self.pending_rects);
            self.rect_buffer = self.create_buffer(&rects);
            self.uploaded_rect_count = rects.len() as u32;
        }

        if !self.pending_lines.is_empty() {
            let lines = std::mem::take(&mut self.pending_lines);
            self.line_buffer = self.create_buffer(&lines);
            self.uploaded_line_count = lines.len() as u32;
        }

        if !self.pending_polygon_vertices.is_empty() || !self.pending_polygon_indices.is_empty() {
            let vertices = std::mem::take(&mut self.pending_polygon_vertices);
            let indices = std::mem::take(&mut self.pending_polygon_indices);
            self.polygon_vertex_buffer = self.create_buffer(&vertices);
            self.polygon_index_buffer = self.create_buffer(&indices);
            self.uploaded_polygon_index_count = indices.len() as u32;
        }

        let has_paths = !self.pending_path_points.is_empty();
        let path_bind_group = if has_paths {
            use wgpu::util::DeviceExt;
            let points_buf = self.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
                label: Some("Path Points Global Storage Buffer"),
                contents: bytemuck::cast_slice(&self.pending_path_points),
                usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST,
            });
            let styles_buf = self.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
                label: Some("Path Styles Global Storage Buffer"),
                contents: bytemuck::cast_slice(&self.pending_path_styles),
                usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST,
            });
            let args_buf = self.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
                label: Some("Path Routing Args Global Storage Buffer"),
                contents: bytemuck::cast_slice(&self.pending_path_args),
                usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST,
            });

            Some(self.device.create_bind_group(&wgpu::BindGroupDescriptor {
                label: Some("Global Monolithic Path Bind Group 1"),
                layout: &self.path_bind_group_layout,
                entries: &[
                    wgpu::BindGroupEntry { binding: 0, resource: wgpu::BindingResource::Buffer(points_buf.as_entire_buffer_binding()) },
                    wgpu::BindGroupEntry { binding: 1, resource: wgpu::BindingResource::Buffer(styles_buf.as_entire_buffer_binding()) },
                    wgpu::BindGroupEntry { binding: 2, resource: wgpu::BindingResource::Buffer(args_buf.as_entire_buffer_binding()) },
                ],
            }))
        } else {
            None
        };

        if !self.pending_gradient_rects.is_empty() {
            let grad_rects = std::mem::take(&mut self.pending_gradient_rects);
            self.gradient_rect_buffer = self.create_buffer(&grad_rects);
            self.uploaded_gradient_rect_count = grad_rects.len() as u32;
        }

        // Upload pending glyph instances into the text instance buffer
        if !self.pending_glyphs.is_empty() {
            self.queue.write_buffer(
                &self.text_instance_buffer,
                0,
                bytemuck::cast_slice(&self.pending_glyphs),
            );
        }

        // --------------------------------------------------------------------
        // PHASE 2: BIND GROUP SETUP
        // --------------------------------------------------------------------
        self.main_bind_group = self.device.create_bind_group(&wgpu::BindGroupDescriptor {
            label: Some("Main Bind Group (Updated)"),
            layout: &self.main_bind_group_layout,
            entries: &[
                wgpu::BindGroupEntry { binding: 0, resource: wgpu::BindingResource::Buffer(self.circle_buffer.as_entire_buffer_binding()) },
                wgpu::BindGroupEntry { binding: 1, resource: wgpu::BindingResource::Buffer(self.rect_buffer.as_entire_buffer_binding()) },
                wgpu::BindGroupEntry { binding: 2, resource: wgpu::BindingResource::Buffer(self.line_buffer.as_entire_buffer_binding()) },
                wgpu::BindGroupEntry { binding: 4, resource: wgpu::BindingResource::Buffer(self.gradient_rect_buffer.as_entire_buffer_binding()) },
                wgpu::BindGroupEntry { binding: 5, resource: wgpu::BindingResource::Buffer(self.uniform_buffer.as_entire_buffer_binding()) },
            ],
        });

        let render_pass_desc = wgpu::RenderPassDescriptor {
            label: Some("Main Render Pass"),
            color_attachments: &[Some(wgpu::RenderPassColorAttachment {
                view,
                resolve_target: None,
                ops: wgpu::Operations {
                    load: wgpu::LoadOp::Clear(wgpu::Color::TRANSPARENT),
                    store: wgpu::StoreOp::Store,
                },
                depth_slice: None,
            })],
            depth_stencil_attachment: None,
            occlusion_query_set: None,
            timestamp_writes: None,
            multiview_mask: None,
        };

        let mut encoder = self.device.create_command_encoder(&wgpu::CommandEncoderDescriptor {
            label: Some("Render Encoder"),
        });

        // --------------------------------------------------------------------
        // PHASE 3: ORCHESTRATED DRAWING
        // --------------------------------------------------------------------
        {
            let mut pass = encoder.begin_render_pass(&render_pass_desc);
            pass.set_bind_group(0, Some(&self.main_bind_group), &[]);

            for batch in &self.batches {
                match batch {
                    DrawBatch::Circle { start, count } => {
                        pass.set_pipeline(&self.circle_pipeline);
                        pass.draw(0..4, *start..(*start + *count));
                    }
                    DrawBatch::Rect { start, count } => {
                        pass.set_pipeline(&self.rect_pipeline);
                        pass.draw(0..4, *start..(*start + *count));
                    }
                    DrawBatch::Line { start, count } => {
                        pass.set_pipeline(&self.line_pipeline);
                        pass.draw(0..4, *start..(*start + *count));
                    }
                    DrawBatch::Polygon { index_start, index_count } => {
                        pass.set_pipeline(&self.polygon_pipeline);
                        pass.set_vertex_buffer(0, self.polygon_vertex_buffer.slice(..));
                        pass.set_index_buffer(self.polygon_index_buffer.slice(..), wgpu::IndexFormat::Uint16);
                        pass.draw_indexed(*index_start..(*index_start + *index_count), 0, 0..1);
                    }
                    DrawBatch::GradientRect { start, count } => {
                        pass.set_pipeline(&self.gradient_rect_pipeline);
                        pass.draw(0..4, *start..(*start + *count));
                    }
                    DrawBatch::PathSimple { path_idx, point_count } => {
                        if let Some(global_path_bg) = &path_bind_group {
                            pass.set_pipeline(&self.path_simple_pipeline);
                            pass.set_bind_group(1, Some(global_path_bg), &[]);
                            let virtual_vertex_count = (*point_count - 1) * 6;
                            pass.draw(0..virtual_vertex_count, *path_idx..(*path_idx + 1));
                        }
                    }
                    DrawBatch::Text { start, count } => {
                        let glyph_count = *count as u32;
                        if glyph_count > 0 {
                            let text_bind_group = self.device.create_bind_group(&wgpu::BindGroupDescriptor {
                                label: Some("Dynamic Local Text Batch Bind Group (@group(1))"),
                                layout: &self.text_bind_group_layout,
                                entries: &[
                                    wgpu::BindGroupEntry {
                                        binding: 0,
                                        resource: wgpu::BindingResource::Buffer(self.text_instance_buffer.as_entire_buffer_binding()),
                                    },
                                    wgpu::BindGroupEntry {
                                        binding: 1,
                                        resource: wgpu::BindingResource::TextureView(&self.text_atlas_view),
                                    },
                                    wgpu::BindGroupEntry {
                                        binding: 2,
                                        resource: wgpu::BindingResource::Sampler(&self.text_atlas_sampler),
                                    },
                                ],
                            });

                            pass.set_pipeline(&self.text_pipeline);
                            pass.set_bind_group(1, Some(&text_bind_group), &[]);
                            pass.draw(0..6, *start..(*start + glyph_count));
                        }
                    }
                }
            }
        }

        self.queue.submit(Some(encoder.finish()));
        self.reset();
    }
}

// ============================================================================
// RenderBackend Trait Implementation
// ============================================================================

impl RenderBackend for WgpuBackend {
    fn draw_circle(&mut self, config: CircleConfig) {
        let fill = config.fill.rgba();
        let point = GpuPoint {
            x: config.x,
            y: config.y,
            r: fill[0],
            g: fill[1],
            b: fill[2],
            a: fill[3] * config.opacity,
            radius: config.radius,
        };

        // 1. Store the circle data into the CPU-side pending buffer
        self.pending_circles.push(point);

        // 2. Increment the counter (used as a reference for calculating the start offset in batching)
        self.current_circle_count += 1;

        // 3. Register the draw command in the batch queue (entry point for batching)
        self.push_batch(BatchType::Circle, 1);
    }

    fn draw_rect(&mut self, config: RectConfig) {
        let fill = config.fill.rgba();
        let rect = GpuRect {
            x: config.x,
            y: config.y,
            width: config.width,
            height: config.height,
            r: fill[0],
            g: fill[1],
            b: fill[2],
            a: fill[3] * config.opacity,
            corner_radius: 0.0,
        };

        // 1. Store the rect data into the CPU-side pending buffer
        self.pending_rects.push(rect);

        // 2. Increment the rect counter (used as a reference for calculating the start offset)
        self.current_rect_count += 1;

        // 3. Register the draw command in the batch queue
        self.push_batch(BatchType::Rect, 1);
    }

    fn draw_line(&mut self, config: LineConfig) {
        let color = config.color.rgba();
        let line = GpuLine {
            x1: config.x1,
            y1: config.y1,
            x2: config.x2,
            y2: config.y2,
            r: color[0],
            g: color[1],
            b: color[2],
            a: color[3] * config.opacity,
            width: config.width,
            _pad1: 0.0,
            _pad2: 0.0,
            _pad3: 0.0,
        };

        // 1. Store the line data into the CPU-side pending buffer
        self.pending_lines.push(line);

        // 2. Increment the line counter (used as a reference for calculating the start offset)
        self.current_line_count += 1;

        // 3. Register the draw command in the batch queue
        self.push_batch(BatchType::Line, 1);
    }

    // ------------------------------
    // Direct vertex shapes (no tessellation)
    // ------------------------------
    /// Renders a polygon using PRE-COMPUTED vertices directly (no tessellation needed).
    /// This is the most efficient path for regular shapes (triangle, diamond, star, hexagon, etc.)
    /// that are already generated upstream in the PointRenderer.
    ///
    /// - Skips expensive path tessellation/geometry subdivision
    /// - Directly uploads vertices to GPU for maximum performance
    /// - Matches SVG/PNG backend behavior 1:1
    /// - Uses simple triangle fan for convex polygons & stars
    fn draw_polygon(&mut self, config: PolygonConfig) {
        // A valid polygon requires at least 3 vertices. Early exit if invalid.
        if config.points.len() < 3 {
            return;
        }

        // Resolve fill color and apply opacity modulation
        let fill = config.fill.rgba();
        let color = [fill[0], fill[1], fill[2], fill[3] * config.fill_opacity];

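        // Triangle fan rendering: use the FIRST vertex as the common origin/fan center
        let point_count = config.points.len();
        let vertex_base = self.pending_polygon_vertices.len();

        // Indices are u16 (IndexFormat::Uint16). Skip polygons whose vertices would not
        // be addressable instead of silently wrapping the index values.
        if vertex_base + point_count > u16::MAX as usize + 1 {
            eprintln!("[WARN] Polygon vertex count exceeds the u16 index range; skipping polygon.");
            return;
        }
        let base_vertex = vertex_base as u16;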

        for &(x, y) in &config.points {
            self.pending_polygon_vertices.push(PathVertex {
                position: [x as f32, y as f32],
                color,
                is_fill: 1.0,
            });
        }

        // Generate triangle fan indices
        let mut indices = Vec::new();
        for i in 1..point_count - 1 {
            indices.extend([
                base_vertex,
                base_vertex + i as u16,
                base_vertex + (i + 1) as u16,
            ]);
        }

        // 1. Append finalized indices to pending render buffers
        let index_count = indices.len() as u32;
        self.pending_polygon_indices.extend(indices);

        // 2. Update the counter (track total indices uploaded)
        self.current_polygon_index_count += index_count;

        // 3. Register the polygon batch
        self.push_batch(BatchType::Polygon, index_count);
    }

    fn draw_gradient_rect(&mut self, config: GradientRectConfig) {
        // Only the first and last stops are used; intermediate stops are ignored here.
        let (start_color, end_color) = match config.stops.as_slice() {
            [] => (SingleColor::none(), SingleColor::none()),
            [(_, color)] => (color.clone(), color.clone()),
            _ => (
                config.stops.first().unwrap().1.clone(),
                config.stops.last().unwrap().1.clone(),
            ),
        };

        let start_rgba = start_color.rgba();
        let end_rgba = end_color.rgba();
        let grad_rect = GpuGradientRect {
            x: config.x,
            y: config.y,
            width: config.width,
            height: config.height,
            start_r: start_rgba[0],
            start_g: start_rgba[1],
            start_b: start_rgba[2],
            start_a: start_rgba[3],
            end_r: end_rgba[0],
            end_g: end_rgba[1],
            end_b: end_rgba[2],
            end_a: end_rgba[3],
            angle: if config.is_vertical {
                std::f32::consts::FRAC_PI_2
            } else {
                0.0
            },
            opacity: 1.0,
        };

        // 1. Store the gradient rect data into the CPU-side pending buffer
        self.pending_gradient_rects.push(grad_rect);

        // 2. Increment the gradient rect counter
        self.current_grad_rect_count += 1;

        // 3. Register the draw command in the batch queue
        self.push_batch(BatchType::GradientRect, 1);
    }

    fn draw_path(&mut self, config: PathConfig) {
        self.tessellate_path(config);
    }

    fn draw_text(&mut self, config: TextConfig) {
        // 1. Rasterize at device pixels (HiDPI aware) then expose logical coordinates to shader
        let font_size = config.font_size;
        let scale_px = ab_glyph::PxScale::from(font_size * self.scale_factor);
        let scaled_font_px = self.font.as_scaled(scale_px);

        // Compute total width in logical units (divide physical advances by scale_factor)
        let mut total_width = 0.0f32;
        let mut last_glyph_id = None;
        const TRACKING: f32 = 0.3;

        let mut text_top = f32::INFINITY;
        let mut text_bottom = f32::NEG_INFINITY;
        let mut text_glyphs: Vec<(ab_glyph::GlyphId, CachedGlyphInfo)> = Vec::new();

        for ch in config.text.chars() {
            if ch.is_control() { continue; }
            let gid = self.font.glyph_id(ch);

            if let Some(prev) = last_glyph_id {
                total_width += scaled_font_px.kern(prev, gid) / self.scale_factor;
            }

            let cached_glyph = if let Some(info) = self.font_cache.get(&(ch, (font_size * self.scale_factor) as u32)) {
                info.clone()
            } else {
                // Cache miss: rasterize the glyph (`gid` was already resolved above)
                let glyph = gid.with_scale(scale_px);
                let cached = if let Some(outlined) = self.font.outline_glyph(glyph) {
                    let bounds = outlined.px_bounds();
                    let mut width = bounds.width().ceil() as u32;
                    let mut height = bounds.height().ceil() as u32;
                    if width == 0 { width = 1; }
                    if height == 0 { height = 1; }

                    const ATLAS_WIDTH: u32 = 2048;
                    const ATLAS_HEIGHT: u32 = 2048;

                    if self.atlas_current_x + width + 4 > ATLAS_WIDTH {
                        self.atlas_current_y += self.atlas_max_row_height + 4;
                        self.atlas_current_x = 4;
                        self.atlas_max_row_height = 0;
                    }
                    if self.atlas_current_y + height + 4 > ATLAS_HEIGHT {
                        eprintln!("[WARN] GPU Font Atlas Cache full! Skipping glyph rasterization.");
                        CachedGlyphInfo { uv_min: [0.0, 0.0], uv_max: [0.0, 0.0], width: 0.0, height: 0.0, o_x: 0.0, o_y: 0.0 }
                    } else {
                        let mut alpha_pixels = vec![0u8; (width * height) as usize];
                        outlined.draw(|x, y, alpha| {
                            let idx = (y * width + x) as usize;
                            if idx < alpha_pixels.len() {
                                let alpha = alpha.clamp(0.0, 1.0);
                                alpha_pixels[idx] = (alpha * 255.0).round() as u8;
                            }
                        });

                        let bytes_per_row = ((width + 255) / 256) * 256;
                        let mut padded_pixels = vec![0u8; (bytes_per_row * height) as usize];
                        for row in 0..height as usize {
                            let src_start = row * width as usize;
                            let dst_start = row * bytes_per_row as usize;
                            padded_pixels[dst_start..dst_start + width as usize].copy_from_slice(
                                &alpha_pixels[src_start..src_start + width as usize],
                            );
                        }

                        self.queue.write_texture(
                            wgpu::TexelCopyTextureInfo {
                                texture: &self.text_atlas_texture,
                                mip_level: 0,
                                origin: wgpu::Origin3d { x: self.atlas_current_x, y: self.atlas_current_y, z: 0 },
                                aspect: wgpu::TextureAspect::All,
                            },
                            &padded_pixels,
                            wgpu::TexelCopyBufferLayout {
                                offset: 0,
                                bytes_per_row: Some(bytes_per_row),
                                rows_per_image: Some(height),
                            },
                            wgpu::Extent3d { width, height, depth_or_array_layers: 1 },
                        );

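                        // Atlas dimensions in texels; these must match the size of
                        // `text_atlas_texture`. The UV rectangle below is inset by half a texel
                        // on each side so linear filtering does not sample neighboring glyphs.
                        let atlas_w = 2048.0f32;
                        let atlas_h = 2048.0f32;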
                        let uv_min = [
                            (self.atlas_current_x as f32 + 0.5) / atlas_w,
                            (self.atlas_current_y as f32 + 0.5) / atlas_h,
                        ];
                        let uv_max = [
                            (self.atlas_current_x as f32 + (width as f32) - 0.5) / atlas_w,
                            (self.atlas_current_y as f32 + (height as f32) - 0.5) / atlas_h,
                        ];

                        let info = CachedGlyphInfo {
                            uv_min,
                            uv_max,
                            width: width as f32,
                            height: height as f32,
                            o_x: bounds.min.x,
                            o_y: bounds.min.y,
                        };

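                        // Key the cache on the character and its physical pixel size.
                        self.font_cache.insert((ch, (font_size * self.scale_factor) as u32), info.clone());
                        // Advance the atlas pen, leaving a 4-pixel gutter between glyphs.
                        self.atlas_current_x += width + 4;
                        // Track the tallest glyph in the current atlas row.
                        self.atlas_max_row_height = self.atlas_max_row_height.max(height);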
                        info
                    }
                } else {
                    CachedGlyphInfo { uv_min: [0.0, 0.0], uv_max: [0.0, 0.0], width: 0.0, height: 0.0, o_x: 0.0, o_y: 0.0 }
                };
                cached
            };

            // Accumulate the vertical extent of the run (in physical pixels).
            let top = cached_glyph.o_y;
            let bottom = cached_glyph.o_y + cached_glyph.height;
            text_top = text_top.min(top);
            text_bottom = text_bottom.max(bottom);
            text_glyphs.push((gid, cached_glyph.clone()));

            total_width += scaled_font_px.h_advance(gid) / self.scale_factor + TRACKING;
            last_glyph_id = Some(gid);
        }

        // Tracking is added after every glyph, so drop the trailing one.
        if total_width > 0.0 {
            total_width -= TRACKING;
        }

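        // Horizontal anchor: shift the run left by half or all of its width.
        // Any other value (e.g. "start") leaves the run left-aligned at `config.x`.
        let mut dx = 0.0f32;
        match config.text_anchor.as_str() {
            "middle" => dx -= total_width / 2.0,
            "end" => dx -= total_width,
            _ => {}
        }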

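        // Vertical baseline: font metrics are converted from physical to logical pixels.
        // Any other value keeps `config.y` as the alphabetic baseline.
        let mut dy = 0.0f32;
        let ascent = scaled_font_px.ascent() / self.scale_factor;
        let descent = scaled_font_px.descent() / self.scale_factor;
        match config.dominant_baseline.as_str() {
            "hanging" => dy += ascent,
            "central" | "middle" => {
                // Shift by the midpoint between ascent and descent.
                dy += (ascent + descent) / 2.0;
            }
            _ => {}
        }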

        let mut current_x = config.x + dx;
        let current_y = config.y + dy;

        let mut glyphs_in_this_call = 0usize;
        let mut last_glyph_id = None;

        for (gid, cached_glyph) in text_glyphs {
            // Apply pair kerning before placing this glyph so the adjustment moves the
            // current glyph rather than the one that follows it.
            if let Some(prev_gid) = last_glyph_id {
                current_x += scaled_font_px.kern(prev_gid, gid) / self.scale_factor;
            }

            let render_x = current_x;
            let render_y = current_y;

            // Glyphs beyond the 4096-entry batch capacity are dropped, but the pen still
            // advances so the remaining layout stays consistent.
            if self.pending_glyphs.len() < 4096 {
                let color_arr = config.color.rgba();
                self.pending_glyphs.push(GpuGlyph {
                    x: render_x,
                    y: render_y,
                    width: cached_glyph.width / self.scale_factor,
                    height: cached_glyph.height / self.scale_factor,
                    uv_min_x: cached_glyph.uv_min[0],
                    uv_min_y: cached_glyph.uv_min[1],
                    uv_max_x: cached_glyph.uv_max[0],
                    uv_max_y: cached_glyph.uv_max[1],
                    r: color_arr[0] as f32,
                    g: color_arr[1] as f32,
                    b: color_arr[2] as f32,
                    a: color_arr[3] as f32,
                    o_x: cached_glyph.o_x / self.scale_factor,
                    o_y: -cached_glyph.o_y / self.scale_factor,
                    _pad0: 0.0,
                    _pad1: 0.0,
                });

                self.current_glyph_count += 1;
                glyphs_in_this_call += 1;
            }

            current_x += scaled_font_px.h_advance(gid) / self.scale_factor + TRACKING;
            last_glyph_id = Some(gid);
        }

        if glyphs_in_this_call > 0 {
            self.push_batch(BatchType::Text, glyphs_in_this_call as u32);
        }
    }
}
}