Node.js Textbook · Part 10 · Stream

Stream: Handling Large Files Without Worrying About Memory

Don't load a 1 GB file into memory all at once; let it flow through in chunks.

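Recall the fs.readFile trap from Part 6: reading a 1 GB file holds the entire 1 GB in memory. If 100 users connect at once and do the same thing, that is 100 GB, and the server dies with an out-of-memory (OOM) error.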

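The answer is Stream. A stream reads the file in small chunks, passes each chunk along as soon as it is processed, then moves on to the next one. Peak memory use is roughly chunk size × number of concurrent streams, so even a 1 GB file needs only tens of kilobytes per stream (fs.createReadStream reads 64 KiB at a time by default). This is one of Node's real strengths.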
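To make the chunk-size claim concrete, here is a minimal sketch that records the size of every chunk a file stream emits. The helper name chunkSizes is hypothetical and not part of Node; highWaterMark is the real fs.createReadStream option that caps the chunk size. It uses for await, which section 5 covers.

```javascript
import fs from 'node:fs';

// Hypothetical helper (not a Node API): list the size of each chunk read from a file.
async function chunkSizes(path, highWaterMark = 64 * 1024) {
  const sizes = [];
  const stream = fs.createReadStream(path, { highWaterMark });
  for await (const chunk of stream) {
    // Each chunk is a Buffer no larger than highWaterMark bytes.
    sizes.push(chunk.length);
  }
  return sizes;
}
```

Calling chunkSizes('big.bin') on a large file should return a long list of values at or below 64 KiB, however big the file is. Only one chunk at a time needs to be held by this loop.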

1. The Three Stream Types: Readable, Writable, Transform

Streams come in three kinds, depending on the direction the data flows.

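Type	Direction	Examples
Readable	Read: emits data	`fs.createReadStream`, HTTP request (server side)
Writable	Write: receives data	`fs.createWriteStream`, HTTP response (server side)
Transform	Mid-stream processing: receives, transforms, then emits	compression (gzip), encryption, line splitting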

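(Strictly speaking there is a fourth type, Duplex, which is both readable and writable. TCP sockets are Duplex, and Transform is itself a special kind of Duplex, but you rarely need to implement one yourself.)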

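Connecting these three like Lego bricks with pipe() or pipeline() is the essence of streams. Prefer pipeline(): unlike pipe(), it forwards errors and destroys every stream in the chain when one of them fails.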
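As a rough illustration of the three roles wired together, the sketch below pushes a few strings through a hand-written Transform. The function name shout and the upper-casing step are made up for this example; Readable.from, Transform, Writable, and pipeline are the standard node:stream APIs.

```javascript
import { Readable, Writable, Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: upper-case every chunk and collect the result.
async function shout(words) {
  let received = '';

  // Readable: emits each array element as a chunk.
  const source = Readable.from(words);

  // Transform: receives a chunk, reshapes it, passes it downstream.
  const upper = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });

  // Writable: receives the final chunks.
  const sink = new Writable({
    write(chunk, encoding, callback) {
      received += chunk.toString();
      callback();
    },
  });

  await pipeline(source, upper, sink);
  return received;
}
```

For example, shout(['hello ', 'stream']) resolves once the last chunk has reached the sink. Swapping the Transform for another one leaves the source and sink untouched, which is the Lego-brick property.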

2. Copying a File: readFile vs createReadStream

The same task, done two ways.

// ❌ A 1 GB file takes 1 GB of memory
import { readFile, writeFile } from 'node:fs/promises';
const data = await readFile('big.bin');
await writeFile('copy.bin', data);

// ✅ 64 KiB chunks (the fs default), almost no memory used
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';

await pipeline(
  fs.createReadStream('big.bin'),
  fs.createWriteStream('copy.bin'),
);

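pipeline connects the two streams. Data flows chunk by chunk, and if the reader outpaces the writer, reading pauses until the writer catches up (backpressure). The result is identical, but peak memory drops from about 1 GB to a few chunks' worth, roughly one ten-thousandth.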
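To show what backpressure means underneath, here is a sketch of the bookkeeping that pipeline does for you: write() returns false when the destination's internal buffer is full, and the 'drain' event signals that writing can resume. The helper name writeWithBackpressure and its return value are illustrative assumptions, not a Node API.

```javascript
import { once } from 'node:events';
import { finished } from 'node:stream/promises';

// Hypothetical helper: write chunks one by one, pausing whenever the
// destination reports that its buffer is full.
async function writeWithBackpressure(writable, chunks) {
  let pauses = 0;
  for (const chunk of chunks) {
    const canContinue = writable.write(chunk);
    if (!canContinue) {
      pauses++;
      // Wait until the buffered data has been flushed.
      await once(writable, 'drain');
    }
  }
  writable.end();
  await finished(writable); // resolves once everything has been written
  return pauses;
}
```

In application code you would normally not write this loop yourself. Using pipeline gives the same flow control along with error propagation and cleanup.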

3. Transform: Converting Data Mid-Flow

This is the most common pattern in practice. Compression, decryption, and line-by-line processing are all Transforms.

// gzip compression
import fs from 'node:fs';
import zlib from 'node:zlib';
import { pipeline } from 'node:stream/promises';

await pipeline(
  fs.createReadStream('log.txt'),
  zlib.createGzip(),
  fs.createWriteStream('log.txt.gz'),
);

zlib.createGzip() is the Transform in the middle. It compresses each chunk it receives and passes the result downstream. Likewise, zlib.createGunzip() decompresses.
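The reverse direction looks the same with the Transform swapped. The sketch below wraps both directions in two small functions; the names gzipFile and gunzipFile are made up for illustration, while zlib.createGzip and zlib.createGunzip are the real zlib factories.

```javascript
import fs from 'node:fs';
import zlib from 'node:zlib';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: compress src into dest.
async function gzipFile(src, dest) {
  await pipeline(
    fs.createReadStream(src),
    zlib.createGzip(),
    fs.createWriteStream(dest),
  );
}

// Hypothetical helper: decompress src into dest.
async function gunzipFile(src, dest) {
  await pipeline(
    fs.createReadStream(src),
    zlib.createGunzip(),
    fs.createWriteStream(dest),
  );
}
```

Running gzipFile('log.txt', 'log.txt.gz') followed by gunzipFile('log.txt.gz', 'restored.txt') should give back a byte-identical file. At no point is the whole log held in memory.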

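4. Combining with HTTP Responses: Large File Downloads

In practice, when a user downloads a large file, stream it straight to the client instead of loading the whole thing into memory.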

import http from 'node:http';
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';

http.createServer(async (req, res) => {
  if (req.url === '/download') {
    res.setHeader('Content-Type', 'application/octet-stream');
    res.setHeader('Content-Disposition', 'attachment; filename=big.bin');
    try {
      await pipeline(
        fs.createReadStream('./big.bin'),
        res,                          // ← res itself is a Writable Stream
      );
    } catch (err) {
      // pipeline has already destroyed both streams by this point
      console.error('Transfer failed:', err);
    }
  } else {
    // Answer every other path; otherwise the request would hang
    res.statusCode = 404;
    res.end();
  }
}).listen(3000);

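The key point is that the HTTP res is also a Writable Stream. Because the file flows straight to the client without being loaded into memory, the response starts quickly, and memory does not blow up even with many concurrent users.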
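One possible refinement, sketched under assumptions: look the file up with stat first, so that a missing file gets a clean 404 before any headers are sent and the client receives a Content-Length. The factory name createDownloadServer and its filePath parameter are hypothetical.

```javascript
import http from 'node:http';
import fs from 'node:fs';
import { stat } from 'node:fs/promises';
import { pipeline } from 'node:stream/promises';

// Hypothetical factory: serve one file at /download as a streamed attachment.
function createDownloadServer(filePath) {
  return http.createServer(async (req, res) => {
    if (req.url !== '/download') {
      res.statusCode = 404;
      res.end();
      return;
    }
    let info;
    try {
      info = await stat(filePath); // fails if the file does not exist
    } catch {
      res.statusCode = 404;
      res.end();
      return;
    }
    res.setHeader('Content-Type', 'application/octet-stream');
    res.setHeader('Content-Length', info.size);
    try {
      await pipeline(fs.createReadStream(filePath), res);
    } catch (err) {
      // Typically the client disconnected mid-transfer.
      console.error('Transfer failed:', err);
    }
  });
}
```

Something like createDownloadServer('./big.bin').listen(3000) would start it. Knowing the size up front lets browsers show download progress, while the body is still streamed chunk by chunk.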

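fetch and Response.body in Node 18+: when you want to pass data received from an external API straight downstream, (await fetch(url)).body is also a stream, but a Web Stream (a WHATWG ReadableStream) rather than a Node stream. Convert it with Readable.fromWeb() and you can connect it to pipeline.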
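A minimal sketch of that conversion, assuming Node 18 or later: the helper below takes any fetch Response and streams its body to a file. The name saveBody and the destPath parameter are illustrative; Readable.fromWeb is the real bridge from Web Streams to Node streams.

```javascript
import fs from 'node:fs';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: stream a fetch Response body to disk.
async function saveBody(response, destPath) {
  if (!response.body) throw new Error('response has no body');
  await pipeline(
    Readable.fromWeb(response.body),   // Web ReadableStream → Node Readable
    fs.createWriteStream(destPath),
  );
}
```

Typical use would be await saveBody(await fetch(url), 'out.bin'), where url stands for whatever endpoint you are downloading from. The body is written as it arrives, so the size of the download does not determine memory use.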

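5. Async Iterators: The Modern Way to Consume Streams

Since Node 10, a Readable Stream can be iterated directly with for await. This is the most natural modern pattern. A readline interface is async iterable in the same way, and the example below uses one to get a full line per iteration instead of a raw chunk.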

import fs from 'node:fs';
import readline from 'node:readline';

const rl = readline.createInterface({
  input: fs.createReadStream('access.log'),
  crlfDelay: Infinity,
});

let count = 0;
for await (const line of rl) {
  if (line.includes('ERROR')) count++;
}
console.log(`${count} errors`);

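Even a 1 GB log is processed line by line as it flows through. Using await inside the for await loop also works naturally: the loop does not ask for the next item until the body has finished. The old pattern of data and end event handlers is rarely needed anymore.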
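The same loop works on a Readable itself, in which case each item is a raw chunk rather than a line. The sketch below counts the bytes in a file and awaits inside the loop body to stand in for slow per-chunk work. The function name countBytes and the 1 ms delay are purely illustrative.

```javascript
import fs from 'node:fs';
import { setTimeout as sleep } from 'node:timers/promises';

// Hypothetical helper: total the bytes in a file, one chunk at a time.
async function countBytes(path) {
  let total = 0;
  for await (const chunk of fs.createReadStream(path)) {
    total += chunk.length;
    // Simulated async work; the next chunk is not requested until this resolves.
    await sleep(1);
  }
  return total;
}
```

If the stream fails partway through, for example because the file is missing, the error surfaces as a rejection of countBytes, so an ordinary try/catch around the call is enough.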

Summary: Where Part 10 Lands

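To recap, streams come down to two ideas: chunk-by-chunk processing (saves memory) and automatic backpressure (regulates speed). There are three types (Readable, Writable, Transform), you connect them with pipeline (which handles errors and cleanup for you), and for await is the modern way to consume them. For large files, log processing, HTTP responses, and compression, streams are the right tool. The next part covers Buffer, for working with binary data.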

Coming Up Next: Buffer

Node's basic type for handling binary data. Part 11.