Skip to main content

Command Palette

Search for a command to run...

Fixing Over-Engineering using just Postgres

Published
8 min read
Fixing Over-Engineering using just Postgres
A

Hi, Aviral this side — a young hobbyist software developer and a tinkerer from India. With a keen interest in Backend systems, Distributed/Cloud computing, Security, DevOps and Electronics - I enjoy hunting for bugs and challenging engineering problems. Code, keyboard's tak-tak(s), and coffee are my constant companions.

(for best experience read this on my blog - https://mraviral.in/blog/just-postgres )

Let’s be real for a second. We are all guilty of "Over Engineering", without even knowing sometimes.

You start a new project, and immediately your brain goes: "Okay, I need Redis for caching and rate limiting, Kafka/RabbitMQ for the message queue, and maybe Elasticsearch for search."

Suddenly, your simple Node.js app has four different infrastructure dependencies before you’ve even written a single line of business logic. It’s expensive, it’s a pain to deploy, and honestly? It’s usually overkill.

For 95% of applications starting out, PostgreSQL is all you need.

Today, we are going to simplify our stack. We are going to rip out Redis and RabbitMQ and replace them with raw, efficient Postgres features.


Part 1: The Rate Limiter (Ditching Redis)

Usually, people reach for Redis for rate limiting because it's fast. Why is it fast? Because it’s in-memory. But Postgres can be incredibly fast too, if you tell it to stop worrying so much about safety.

The Concept: What is the WAL?

To understand this, you need to know about the Write-Ahead Log (WAL).

By default, Postgres is paranoid about losing data. When you insert a row, Postgres writes that change to a "journal" (the WAL) before it even confirms the save. This ensures that if the power plug gets pulled, your data is safe.

When you send an INSERT command to Postgres, It doesn't write the data straight to the table file on the hard drive. Random disk writes are slow.

Instead, Postgres does something smarter but heavier:

  1. The WAL Writer: It first writes the change to a sequential log file called the WAL (Write-Ahead Log).

  2. The Promise: Only after this log entry is safely "fsync'd" (physically flushed) to the disk does Postgres tell your Node.js app, "Success! Data saved."

  3. The Checkpoint: Later, in the background, a "Checkpointer" process moves the data from memory to the actual table files.

But for a rate limiter (checking if an IP hit an endpoint too many times), we don't care if data is lost during a crash. If the server reboots and the counter resets, it’s not the end of the world.

Enter: Unlogged Tables

Unlogged tables tell Postgres: "Don't write to the WAL. Just write to memory/disk and go fast."

  • Pros: Insanely fast writes (comparable to Redis in many workloads).

  • Cons: If Postgres crashes, the table is wiped clean.

This is perfect for temporary data like rate limits.

Step 1: The SQL Setup

First, create the unlogged table using SQL:

CREATE UNLOGGED TABLE rate_limits (
    ip_address INET PRIMARY KEY,
    request_count INT DEFAULT 1,
    last_request TIMESTAMP DEFAULT NOW()
);

Step 2: The "Beast" Query

We need to do everything in a single round-trip to the database for speed. We don't want to fetch, check, and then update. We do it all at once using an Upsert.

Here is the helper function in Node.js:

const { Pool } = require("pg");
const pool = new Pool({
  connectionString: "postgres://myuser:mypassword@localhost:5432/mydatabase",
});

async function checkRateLimit(ip) {
  const limit = 100; // max requests
  const window = "1 minute";

  const client = await pool.connect();
  try {
    const query = `
            INSERT INTO rate_limits (ip_address, request_count, last_request)
            VALUES ($1, 1, NOW())
            ON CONFLICT (ip_address) 
            DO UPDATE SET 
                request_count = CASE 
                    WHEN rate_limits.last_request < NOW() - INTERVAL '${window}' THEN 1 
                    ELSE rate_limits.request_count + 1 
                END,
                last_request = CASE 
                    WHEN rate_limits.last_request < NOW() - INTERVAL '${window}' THEN NOW() 
                    ELSE rate_limits.last_request 
                END
            RETURNING request_count;
        `;

    const res = await client.query(query, [ip]);
    const count = res.rows[0].request_count;

    return count <= limit;
  } finally {
    client.release();
  }
}

What is happening in this query?

It looks scary, but it’s actually just handling three scenarios in one go:

  1. Try to Insert: If the IP is new, insert it with a count of 1.

  2. Conflict: If the IP exists, we trigger the DO UPDATE.

  3. The Logic (CASE): We check the last_request time against the current time.

    • Has the time window passed? Reset the count to 1 and update the timestamp to NOW().

    • Is it still within the window? Just increment the count (request_count + 1) and leave the timestamp alone.

This handles the reset logic atomically inside the DB. No race conditions.

Step 3: The Middleware implementation

Let's not make our controller dirty with this logic. Wrap it in a middleware so you can plug it into any route.

const rateLimiterMiddleware = async (req, res, next) => {
  // Note: If you use Nginx/Load Balancer, use req.headers['x-forwarded-for']
  const ip = req.ip || req.connection.remoteAddress;

  try {
    const isAllowed = await checkRateLimit(ip);

    if (!isAllowed) {
      return res.status(429).json({ error: "Too many requests. Chill." });
    }

    next();
  } catch (err) {
    console.error("Rate limiter error", err);
    next();
  }
};

app.get("/api/expensive-task", rateLimiterMiddleware, (req, res) => {
  res.send("You made it!");
});

Step 4: Maintenance (Garbage Collection)

Since UNLOGGED tables persist (unless the server crashes), this table will grow forever as new IPs visit your site. You need to take out the trash.

You don't need a complex cron job. Just a simple interval in your Node app will do:

// Run every 10 minutes
setInterval(async () => {
  try {
    await pool.query(`
            DELETE FROM rate_limits 
            WHERE last_request < NOW() - INTERVAL '10 minutes'
        `);
    console.log("Cleaned up old rate limit records");
  } catch (e) {
    console.error("Cleanup failed", e);
  }
}, 10 * 60 * 1000);

Extreme Scale: If you are doing 50,000+ requests per second, the overhead of Postgres connections and CPU context switching will become a bottleneck. At that point, Redis (which is single-threaded and purely in-memory) becomes the better choice.


Part 2: The Job Queue (Ditching RabbitMQ/Kafka)

Queues are tricky. If you have two workers trying to pick up a job from a SQL table, they might grab the same row at the same time. This creates race conditions.

The core problem (thundering herd)

Imagine you have a table of jobs and 5 worker processes.

If all 5 workers run a standard SELECT to find the next available job, they will all see the same row.

  • If you use standard locking, Worker 1 grabs the row, and Workers 2–5 hang/wait until Worker 1 is finished. This kills performance.

  • If you don't use locking, all 5 workers might try to process the same job, leading to duplicate emails, double charges, or corrupted data.

Traditionally, people use RabbitMQ to handle this "distribution" of tasks. But Postgres solved this years ago with a specific locking clause.

The Concept: FOR UPDATE SKIP LOCKED

This is the magic spell.

  1. FOR UPDATE: Locks the rows so no one else can modify them.

  2. SKIP LOCKED: This is the key. If a row is already locked by another worker, Postgres will skip it and give you the next available one.

It allows multiple workers to hammer the same table concurrently without stepping on each other's toes and without waiting.

Real-Time Updates: LISTEN / NOTIFY

A standard SQL queue "polls" (checks the DB) every few seconds.

  • Polling too fast? You spike your CPU.

  • Polling too slow? The user waits too long.

We can fix this using Postgres' native Pub/Sub mechanism: LISTEN and NOTIFY. The database literally pokes your Node.js app when a new job arrives.

Step 1: The Table & Trigger

CREATE TABLE job_queue (
    id SERIAL PRIMARY KEY,
    payload JSONB NOT NULL,
    status TEXT DEFAULT 'pending', -- pending, processing, done
    created_at TIMESTAMP DEFAULT NOW()
);

-- Function to notify when a row is inserted
CREATE OR REPLACE FUNCTION notify_new_job() RETURNS trigger AS $$
BEGIN
  -- 'new_job_channel' is the channel name
  -- NEW.id sends the ID of the new row as the payload
  PERFORM pg_notify('new_job_channel', NEW.id::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Trigger the function on insert
CREATE TRIGGER trigger_new_job
AFTER INSERT ON job_queue
FOR EACH ROW EXECUTE FUNCTION notify_new_job();

Step 2: The Worker (Node.js)

We need two connections here: one to Listen (which sits idle waiting for events) and one to Process (which does the work).

const { Client } = require("pg");

const listener = new Client({
  /* config */
});
listener.connect();
listener.query("LISTEN new_job_channel");

async function processJob() {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");

    const query = `
            UPDATE job_queue
            SET status = 'processing'
            WHERE id = (
                SELECT id
                FROM job_queue
                WHERE status = 'pending'
                ORDER BY created_at ASC
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            RETURNING *;
        `;

    const res = await client.query(query);

    if (res.rows.length === 0) {
      await client.query("ROLLBACK");
      return; // No jobs available
    }

    const job = res.rows[0];
    console.log(`Processing Job ID: ${job.id}`);

    // ... execute your actual heavy logic here ...

    // Mark as done (or delete)
    await client.query("DELETE FROM job_queue WHERE id = $1", [job.id]);
    await client.query("COMMIT");

    // Check if there are MORE jobs immediately (don't wait for notify)
    // This clears the backlog if multiple jobs came in at once
    processJob();
  } catch (e) {
    await client.query("ROLLBACK");
    console.error(e);
  } finally {
    client.release();
  }
}

listener.on("notification", (msg) => {
  console.log("Event received:", msg.payload);
  processJob();
});

processJob();

The millisecond a row is inserted, Postgres fires the trigger, the Node app wakes up, processes the job, and goes back to sleep.

Performance Ceiling: This works beautifully up to a few thousand jobs per second. If you are handling millions of messages per minute, the overhead of MVCC (Multi-Version Concurrency Control) and vacuuming in Postgres (maybe we can discuss this in detail in next blogs) will eventually make Kafka a better choice.


The Conclusion

Does this scale to Facebook's size? No. Does it scale to the size of the startup or project you are building right now? Absolutely.

Using Postgres for these tasks reduces your "DevOps Tax."

  • One less service to monitor.

  • One less connection string to manage.

  • One less bill to pay.

Start simple. When your job_queue table hits 10 million rows a day, then you can migrate to Kafka. Until then, Postgres is enough.