ocpp-ws-io
Core WebSocket RPC

System Design

Architecture, data flow, and clustering strategy.

This document details the system design of ocpp-ws-io, focusing on how it handles data flow, clustering, and validation.

1. Architecture Overview

ocpp-ws-io is designed to be transport-agnostic but primarily used with WebSockets. It acts as a high-performance RPC framework for OCPP.

Key Architectural Decisions

  • Stateless Core: The core library itself does not enforce any specific database or caching layer.
  • Adapter Pattern: Clustering and event distribution are handled via an EventAdapterInterface. We provide a RedisAdapter, but you can implement others (e.g., Kafka, NATS).
  • Manual Clustering: The library does not shard connection state across nodes; however, once an adapter is configured, global broadcasting (e.g., via Redis) is handled automatically.
  • Robust Client Management: Sessions are persistent in-memory across reconnections (sticky sessions), and TCP Keep-Alives are enabled to prevent load balancer timeouts.
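
The adapter pattern above can be sketched as follows. Note that the exact shape of EventAdapterInterface (the method names `publish`/`subscribe`/`close` here) is illustrative, not necessarily the library's literal API; consult the API reference before implementing.

```typescript
// Hypothetical sketch of a custom event adapter; method names are
// assumptions, not the library's exact interface.
interface EventAdapterInterface {
  publish(channel: string, payload: string): Promise<void>;
  subscribe(channel: string, onMessage: (payload: string) => void): Promise<void>;
  close(): Promise<void>;
}

// A NATS-backed adapter could then mirror the shipped RedisAdapter:
class NatsAdapter implements EventAdapterInterface {
  constructor(private nc: { publish(subject: string, data: Uint8Array): void }) {}

  async publish(channel: string, payload: string): Promise<void> {
    this.nc.publish(channel, new TextEncoder().encode(payload));
  }
  async subscribe(channel: string, onMessage: (payload: string) => void): Promise<void> {
    // Wire the NATS subscription to the callback here.
  }
  async close(): Promise<void> {}
}
```

The key design point is that the core never talks to Redis (or NATS) directly; it only emits and consumes adapter events.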

2. Connection Reliability & Life-cycle

We prioritize a stable connection even in flaky network conditions (e.g., cellular data).

Upgrade Pipeline

The WebSocket upgrade process is strictly controlled to prevent "hanging" or invalid connections:

  1. Validation: URL, identity, and Sec-WebSocket-Protocol are checked.
  2. Authentication: Sync/Async Basic Auth or Client Certificate (mTLS) check.
  3. Handshake Timeout: A configurable timer (default 30s) ensures auth logic doesn't block indefinitely.
  4. Session Restoration: If a client reconnects, their previous session state is restored.
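
The first three steps typically live in an auth hook; a minimal sketch using the `server.auth()` / `ctx.accept()` / `ctx.reject()` API (the `isValidStation` helper is hypothetical):

```typescript
// Sketch: Basic Auth during the upgrade pipeline (step 2).
// The handshake timeout (step 3) bounds how long this async hook may run.
server.auth(async (ctx) => {
  const protocol = ctx.handshake.headers["sec-websocket-protocol"]; // step 1
  const token = ctx.handshake.headers.authorization;                // step 2
  if (!token || !(await isValidStation(token))) {
    return ctx.reject(401, "Invalid credentials");
  }
  ctx.accept({ session: { authorizedAt: Date.now() } });
});
```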

Events

You can monitor the connection life-cycle with specific server events:

  • upgradeAborted: Fired if the handshake fails (timeout, auth failure).
  • client: Fired when the WebSocket is fully established.
  • close: Fired when the connection is cleanly closed or lost (after ping timeout).
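
These events can be observed directly on the server instance; the payload shapes shown in the callbacks are illustrative:

```typescript
// Life-cycle monitoring hooks (payload shapes are assumptions).
server.on("upgradeAborted", (info) => {
  console.warn("Handshake failed:", info);
});
server.on("client", (client) => {
  console.log("Station fully connected:", client.identity);
});
server.on("close", (client) => {
  console.log("Connection closed or lost:", client.identity);
});
```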

3. Redis Data Flow (Clustering)

When using RedisAdapter, the system does not become a "shared state" cluster automatically. Instead, it allows instances to communicate.

Data Flow Diagram

Automatic Broadcast

When you call server.broadcast({...}):

  1. Local Send: The server iterates over all connected local clients and sends the call.
  2. Remote Publish: If an adapter is configured, it publishes to the ocpp:broadcast channel (prefixed, e.g., ocpp-ws-io:ocpp:broadcast).
  3. Remote Receive: Other nodes receive the message, check the source ID (to avoid loops), and send it to their own local clients.

Example: Using Broadcast

// On Node A and Node B (pub and sub are two separate Redis client connections)
const adapter = new RedisAdapter({ pubClient: pub, subClient: sub });
server.setAdapter(adapter);

// This call reaches ALL chargers on ALL nodes
server.broadcast("Reset", { type: "Soft" });

// Send safely to a single targeted station (avoids unhandled rejections if disconnected)
const { success, result, error } = await server.safeSendToClient(
  "CP-001",
  "ocpp1.6",
  "GetConfiguration",
  { key: [] },
);
if (!success) {
  server.logger.error("Failed to get config from CP-001", { error });
}

Session Persistence & Robustness

To handle unstable networks (4G/LTE), the server keeps session data in memory even after a disconnect.

  1. Sticky Sessions: If a client reconnects with the same identity, their session object is restored.
  2. Garbage Collection: A background job removes sessions that have been inactive for > 2 hours (configurable).
  3. TCP Keep-Alive: Enabled by default to keep connections alive through load balancers.
  4. Ping Jitter: A built-in 25% randomized jitter on ping intervals prevents "Thundering Herd" CPU spikes when thousands of stations reconnect simultaneously.

Rate Limiting (Noisy Neighbor Protection)

To prevent runaway firmware loops from overwhelming the JSON parsing path (effectively a denial of service), ocpp-ws-io includes a Token Bucket rate limiter directly at the socket layer.

  • Use global rate limits to bound maximum throughput per connected station.
  • Use method-specific rules to aggressively drop rapid-fire MeterValues while allowing StopTransaction through.
  • Define custom onLimitExceeded logic (e.g., dropping the payload or forcibly disconnecting the client).

Example: Rate Limit Configuration
const server = new OCPPServer({
  protocols: ["ocpp1.6"],
  rateLimit: {
    limit: 100, // 100 total messages...
    windowMs: 60000, // ...per 60 seconds.
    onLimitExceeded: "disconnect", // "ignore" | "disconnect" | Custom Function
    methods: {
      MeterValues: { limit: 10, windowMs: 60000 },
      Heartbeat: { limit: 2, windowMs: 60000 },
    },
  },
});
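
Conceptually, each `limit`/`windowMs` pair above behaves like a token bucket that refills evenly over the window. A standalone sketch of the mechanism (not the library's internal implementation):

```typescript
// Standalone token-bucket sketch: `limit` tokens refill evenly over `windowMs`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private limit: number, private windowMs: number, now = Date.now()) {
    this.tokens = limit;
    this.lastRefill = now;
  }

  /** Returns true if the message may pass, false if it should be dropped. */
  tryConsume(now = Date.now()): boolean {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.limit, this.tokens + (elapsed / this.windowMs) * this.limit);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// 10 MeterValues per 60s: in an instantaneous burst of 11, the last is dropped.
const bucket = new TokenBucket(10, 60_000, 0);
const results = Array.from({ length: 11 }, () => bucket.tryConsume(0));
console.log(results.filter(Boolean).length); // 10
```

Unlike a fixed window counter, the bucket refills continuously, so a station that pauses briefly regains capacity gradually instead of all at once at a window boundary.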

4. System Design Without Redis (Single Node)

For smaller deployments or development, you can run a single instance without any adapter.

Data Flow Diagram

Trade-offs

  • Simplicity: No external infrastructure dependencies.
  • Limitation: You cannot scale horizontally. If you add a second server, they won't know about each other's clients.

5. Best Implementation Practices

When building a centralized management system for production, avoid putting all your logic into a single server.on('client') block. The best architecture utilizes OCPPRouter, isolated authentication, and strict logging:

optimal-architecture.ts
import { OCPPServer } from "ocpp-ws-io";

// 1. Initialize Server with Global Structured Logging
const server = new OCPPServer({
  protocols: ["ocpp1.6", "ocpp2.0.1"],
  logging: {
    prettify: process.env.NODE_ENV !== "production",
    exchangeLog: true, // Auto-logs all IN/OUT message traffic
    level: "info",
    handler: (entry) => {
      // Stream final JSON logs to Datadog/Sentry
      sendToDatadog(entry);
    },
  },
});

// 2. Define Isolated Authentication per Protocol/Path
const v1Router = server
  .auth(async (ctx) => {
    const token = ctx.handshake.headers.authorization;
    if (!token) return ctx.reject(401, "Missing Auth");

    // Validate token...
    ctx.accept({ session: { version: "v1.6", tenant: "ACME" } });
  })
  .route("/api/v1/chargers/*");

// 3. Attach Version-Aware Handlers to the specific Router
v1Router.on("client", (client) => {
  // Protocol inference matches the handlers
  client.handle("ocpp1.6", "BootNotification", async ({ params }) => {
    return {
      status: "Accepted",
      currentTime: new Date().toISOString(),
      interval: 300,
    };
  });

  client.handle("ocpp1.6", "Heartbeat", () => ({
    currentTime: new Date().toISOString(),
  }));
});

// 4. Implement a Catch-All Route for unknown endpoints
server.use(async (ctx) => {
  ctx.logger.warn(`Rejected unknown connection at ${ctx.handshake.pathname}`);
  ctx.reject(404, "Unknown Endpoint");
}); // Root route fall-through

await server.listen(3000);

Using this architecture ensures your authentication requirements, routing boundaries, and payload validation handlers remain completely modular.


6. Strict RPC Adherence (No Fire-and-Forget)

Unlike some other WebSocket libraries, ocpp-ws-io enforces strict Request-Response tracking for all outbound calls (client.call()). The library does NOT support a noReply: true "fire-and-forget" option for outbound messages, as this fundamentally violates the OCPP specification and leads to memory leaks or "Ghost Response" log spam when the station eventually replies.

If you need to send a command without blocking your main execution flow, simply omit the await keyword (attaching a .catch() handler so rejections are never unhandled), or rely on client.safeCall():

// ✅ DO: Non-blocking Fire-and-Forget natively in JS
client.safeCall("ocpp1.6", "Reset", { type: "Hard" }).then((res) => {
  if (res) console.log("Station accepted reset");
});

// Code continues executing instantly...
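
Equivalently, with the plain client.call() API mentioned above, the .catch() pattern keeps the rejection handled without blocking (the argument order here mirrors the safeCall example and is assumed to match):

```typescript
// Fire the call without awaiting; attach .catch() so a timeout or
// disconnect never becomes an unhandled promise rejection.
client.call("ocpp1.6", "Reset", { type: "Hard" }).catch((err) => {
  console.error("Reset was not acknowledged:", err);
});
```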

Idempotency Keys (Single Source of Truth Delivery)

When retrying an outbound call after a network timeout, you risk the station executing the command twice (e.g., performing a double Reset or RemoteStartTransaction).

To solve this, client.call() and server.sendToClient() accept an idempotencyKey. When provided, ocpp-ws-io replaces the dynamically generated message ID with your deterministic key, ensuring exactly-once execution semantics even across abrupt reconnections.

await server.sendToClient(
  "CP-001",
  "ocpp1.6",
  "UnlockConnector",
  {
    connectorId: 1,
  },
  {
    idempotencyKey: "unlock-txn-9981", // Guarantees identical MessageId on retries
  },
);

7. Data Validation Flow

Validation happens at multiple layers to ensure OCPP compliance and system integrity.

Validation Pipeline

Layers of Validation

  1. Protocol Layer: Checks if the message is a valid OCPP array [MessageType, MessageId, ...].
  2. Schema Layer:
    • If strictMode: true is enabled, the library validates the payload against the official OCPP JSON schemas.
    • To reduce CPU overhead on trusted hardware configurations, developers can use strictModeMethods: ["BootNotification"] to validate ONLY security-critical calls while skipping schema checks for high-traffic events like Heartbeat.
    • Invalid payloads are rejected immediately with a FormatViolation or PropertyConstraintViolation.
  3. App Layer: Your code checks business rules (e.g., "Is this station authorized?").

Example: Selective Strict Mode
// Enabling Strict Schema Validation selectively for maximum performance
const server = new OCPPServer({
  strictMode: true,
  strictModeMethods: ["BootNotification", "Authorize", "StartTransaction"],
  strictModeValidators: standardValidators,
});
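
The third, application layer is plain handler code; a sketch using the handler API shown earlier (isAuthorizedStation is a hypothetical helper):

```typescript
client.handle("ocpp1.6", "Authorize", async ({ params }) => {
  // Schema validation (layer 2) has already run, since strictModeMethods
  // covers Authorize above; here we apply business rules (layer 3).
  const accepted = await isAuthorizedStation(params.idTag);
  return { idTagInfo: { status: accepted ? "Accepted" : "Invalid" } };
});
```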
