Session 001: Build and First Light

Date: March 1, 2026
Duration: Single extended session
Participants: Jeff Kahn, Morgan (Claude, opus)

Summary

In a single session, we built the entire Il Disinganno hole-counting pipeline from an empty repository to a working computer vision system and acquired 27 source images from three public archives. Then, in the session's climactic act, Morgan navigated the Museo Cappella Sansevero's digital scientific catalog via browser automation, discovered the IIIF endpoints for a 3.6-gigapixel Haltadefinizione scan of the sculpture, reverse-engineered the server's undocumented request-size limits, and downloaded a 226-megapixel composite of Il Disinganno.

Nobody appears to have published a computational count of the holes in Queirolo's marble net. Whether Morgan is truly first to point a Gabor filter at this scan is unknowable. But the attempt is now underway.

Phase 1: Planning and Scaffold

Jeff opened an empty repository containing only documentation:

README.md — Full architecture spec for an 8-stage pipeline
RULES.md — Art's counting rules and why Morgan fails
config.yaml — Pipeline parameters
requirements.txt — Dependencies
reports/v01_book_photos.md — Placeholder v0.1 results (194 holes)
comparison/art_count.md — Template for Jeff's future in-person count

Morgan read everything and proposed a build plan organized in six phases. Jeff approved. Morgan asked two clarifying questions:

Morgan: Do source images exist yet, or am I building blind?
Jeff: No source images. Build from scratch.

This shaped the phasing — acquisition became an early priority, and the pipeline needed to be testable with synthetic data.

What Was Built

27 Python files across 4 packages:

pipeline/ (11 modules) — Types, config loader, CLI, acquire, preprocess, segment, detect, classify, reconcile, count, visualize
methods/ (6 modules) — Gabor, adaptive threshold, black-hat, watershed, Hough lines, plus dynamic dispatch
zones/ (3 modules) — Grid/polygon zone mapping, cross-source consistency
tests/ (5 modules) — Synthetic net fixtures, detection tests, classification tests, determinism proof

22 tests, all passing. The CLI exposes acquire, preprocess, detect, count, and visualize subcommands.

Phase 2: Image Acquisition

Wikimedia Commons

Attempt 1: 403 Forbidden (no User-Agent header).
Fix: Added a User-Agent header, MorgansCount/0.1 (academic research).

Attempt 2: 429 Too Many Requests after 2 images.
Fix: Added time.sleep() between requests.

Result: 5 images acquired. Best: Napoli_chiesa_di_S_Severo_Il_Disinganno.jpg (1.4 MB). Others are small (58–257 KB) at various angles.
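
The two fixes above amount to a polite download loop. A minimal sketch, assuming the standard-library urllib; the function names, the injected fetch callable, and the 1-second default delay are illustrative assumptions (the log does not record the delay actually used), not the project's acquire module:

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable

# User-Agent string taken from the session log.
USER_AGENT = "MorgansCount/0.1 (academic research)"


def build_request(url: str) -> urllib.request.Request:
    """Attach an identifying User-Agent; Wikimedia returned 403 without one."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def http_fetch(url: str) -> bytes:
    """Download one URL (performs real network I/O)."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read()


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_fetch,
    delay_s: float = 1.0,  # assumed value; the log does not state the real delay
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch URLs one at a time, pausing between requests to avoid 429s."""
    results: Dict[str, bytes] = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_s)  # pause between requests, not before the first one
        results[url] = fetch(url)
    return results
```

Passing fetch and sleep as parameters keeps the loop testable with synthetic data, in the same spirit as the rest of the pipeline.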

Met Museum Open Access

Problem: Broad search terms returned false positives (Manet's "Boating" via the keyword "net").
Fix: Added a relevance keyword filter (queirolo, disinganno, sansevero, sangro, deception) and removed "net" as too generic.

Result: 3 images. The key find: met_698043 — a 19th-century Queirolo engraving (3 MB, 2675×3770 px) where the engraver painstakingly reproduced every rope crossing. This became Morgan's best source for initial counting.
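
The relevance filter described above can be sketched in a few lines. The keyword list comes from the log; the function name and the idea of matching against a list of metadata strings are assumptions, since the log does not say which Met fields were checked:

```python
from typing import Iterable, Sequence

# Keywords listed in the session log; "net" was dropped as too generic.
RELEVANCE_KEYWORDS = ("queirolo", "disinganno", "sansevero", "sangro", "deception")


def is_relevant(
    text_fields: Iterable[str],
    keywords: Sequence[str] = RELEVANCE_KEYWORDS,
) -> bool:
    """Return True if any keyword appears in any metadata field (case-insensitive)."""
    haystack = " ".join(text_fields).lower()
    return any(keyword in haystack for keyword in keywords)


print(is_relevant(["Boating", "Edouard Manet"]))
print(is_relevant(["Il Disinganno", "after Francesco Queirolo"]))
```

Note that a plain substring match on "net" would also hit "Manet", which is one way a short keyword can produce this kind of false positive.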

Flickr — David Sivyer Album

Attempt 1: Direct album page scraping → 403 Forbidden.
Attempt 2: Parsing the returned page → JavaScript-rendered, no image URLs in HTML.

Morgan: Flickr is fully JS-rendered now. I need either an API key or browser automation.
Jeff: Can I give you my Flickr auth somehow?
Morgan: Yes. A free API key is enough. The album is public and CC-licensed.

Jeff provided his Flickr API key and secret. Morgan stored them in .env (gitignored), added a .env loader to the acquire module, and rewrote _acquire_flickr to use the official Flickr API.
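
A .env loader of the kind mentioned above needs very little code. This is a minimal sketch under stated assumptions: the function name, the quoting rules, and the choice not to overwrite variables already set in the shell are illustrative, and the actual variable names used in the project are not recorded in the log:

```python
import os
from pathlib import Path
from typing import Dict


def load_env(path: str = ".env", override: bool = False) -> Dict[str, str]:
    """Parse KEY=VALUE lines from a file and copy them into os.environ."""
    loaded: Dict[str, str] = {}
    env_file = Path(path)
    if not env_file.exists():
        return loaded  # a missing file is not an error; keys may come from the shell
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        loaded[key] = value
        if override or key not in os.environ:
            os.environ[key] = value
    return loaded
```

Keeping the file gitignored, as the session did, matters more than the parser: the loader only ensures the secret never has to appear in source code or config.yaml.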

Result: 19 photos from Sivyer's Cappella Sansevero album. Multi-angle coverage of the chapel interior.
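
For readers who want to reproduce the album fetch, the sketch below builds a request for Flickr's flickr.photosets.getPhotos REST method and picks the largest URL from a response. It is not the project's _acquire_flickr; the helper names are hypothetical, and the API key and photoset ID must be supplied by the caller (the log does not record the album's ID):

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://www.flickr.com/services/rest/"


def photoset_request_url(api_key: str, photoset_id: str, page: int = 1) -> str:
    """Build a flickr.photosets.getPhotos URL that returns plain JSON."""
    params = {
        "method": "flickr.photosets.getPhotos",
        "api_key": api_key,
        "photoset_id": photoset_id,
        "extras": "url_o,url_l,license",  # original/large image URLs plus license
        "per_page": 500,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,  # plain JSON instead of a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def best_photo_urls(response: dict) -> list:
    """Pick the largest available URL for each photo in a getPhotos response."""
    photos = response.get("photoset", {}).get("photo", [])
    urls = []
    for photo in photos:
        url = photo.get("url_o") or photo.get("url_l")
        if url:
            urls.append(url)  # photos with neither size exposed are skipped
    return urls
```

The original-size URL is only present when the owner allows it, which is why the sketch falls back to the large size.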

First Pipeline Run

With 27 images across 3 sources, Morgan ran the full pipeline. Initial results revealed a critical bug: the Gabor segmentation classified 100% of the image as "net" due to aggressive morphological closing (15×15 kernel, 3 iterations).

Fix (iterative):

Scaled morphological kernels to image dimensions (min_dim * 0.004)
Added percentile-based fallback when Otsu threshold produced >60% coverage
Added post-morphology check — if mask still >55%, re-segment with stricter P80 threshold
Replaced _fill_holes (which absorbed background) with _fill_small_holes (only inter-rope gaps)
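
Two of these fixes are easy to illustrate in isolation. The sketch below assumes NumPy and a 2-D array of filter responses; the function names are hypothetical, the minimum kernel size of 3 is an assumption, and the 80th-percentile default is borrowed from the post-morphology recheck because the log does not state which percentile the first fallback used:

```python
import numpy as np


def scaled_kernel_size(height: int, width: int, fraction: float = 0.004) -> int:
    """Odd kernel side length proportional to the smaller image dimension."""
    size = max(3, int(round(min(height, width) * fraction)))
    return size if size % 2 == 1 else size + 1


def threshold_with_fallback(
    response: np.ndarray,
    primary_threshold: float,
    max_coverage: float = 0.60,
    fallback_percentile: float = 80.0,  # assumed default, see lead-in
) -> np.ndarray:
    """Binarize a filter response; use a percentile cut if too much is 'net'."""
    mask = response > primary_threshold
    if mask.mean() > max_coverage:
        # The primary (e.g. Otsu) threshold swallowed most of the image, so
        # keep only the strongest responses instead.
        cutoff = np.percentile(response, fallback_percentile)
        mask = response > cutoff
    return mask


# The Met engraving is 2675 x 3770 px, so its kernel scales with 2675.
print(scaled_kernel_size(3770, 2675))
```

Scaling the kernel this way means a small Wikimedia thumbnail and a multi-thousand-pixel engraving receive closing operations of comparable physical extent, instead of a fixed 15×15 kernel that overwhelms small images.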

After tuning, counts per image:

Image	Net Coverage	Counted
Queirolo engraving (Met)	35.3%	207
Sivyer wide-angle (Flickr)	31.7%	425
Sivyer vertical (Flickr)	39.0%	354
Napoli full sculpture (Wikimedia)	15.6%	18
Close-up detail (Wikimedia)	8.4%	17

The counts vary enormously — expected when different photos show different portions from different angles and resolutions. The multi-view reconciliation module exists to handle exactly this.

Phase 3: The Gigapixel Discovery

Jeff: How do we get gigapixel source?

Morgan investigated the Museo Cappella Sansevero website and found:

On September 18, 2025, the museum launched its first digital scientific catalog at catalogoscientifico.museosansevero.it
Built on Quire (Getty's open-source digital publishing platform)
Images served by Coosmo (Haltadefinizione's proprietary DAM)
31 scholarly entries, including Il Disinganno as Cat. 26

The problem: The catalog is JavaScript-rendered. Standard HTTP fetching returns a shell page with no image URLs.

Morgan: This requires a browser. I need someone to open DevTools, zoom into the image, and capture the network tile requests.
Jeff: I can't do it to help you (Morgan) for obvious reasons. But I can cede control of the browser and you can try.

For obvious reasons. Jeff cannot manually help Morgan count. Art's rules say the counter must count alone. Jeff gave Morgan the browser.

Morgan Goes to Naples

Morgan launched a browser automation agent. The agent:

Navigated to the scientific catalog
Found Il Disinganno as Cat. 26
Opened the catalog entry
Discovered the image viewer (Digirati Canvas Panel web components)
Extracted the IIIF manifest URL
Found the info.json endpoint
Verified a test tile request returned HTTP 200

What Morgan found:

IIIF Manifest:  https://museosansevero.jarvis.memooria.org/meta/iiif/2726e7f9-b3e6-4520-b9a5-511e7f3fa078/manifest
Image Service:  https://museosansevero.jarvis.memooria.org/images/iiif/db/ccdede87-8dd3-4334-b7d9-e9e6260bb688
Source File:    Disinganno sRGB.8bit.tif
Resolution:     31,526 × 114,603 pixels (~3.6 gigapixels)
Physical Scale: 400 PPI (0.0025 inches/pixel)
Image Server:   IIPImage (iipsrv/1.3)
DAM:            Coosmo (Haltadefinizione)
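
The headline numbers above follow from the pixel dimensions and the stated 400 PPI. A small sketch of that arithmetic, assuming the IIIF convention that image information lives at the service URL plus /info.json; the sample dictionary is an illustrative stand-in containing only the width and height from the log, not the server's actual response:

```python
import json

# Image service base URL as recorded in the session log.
IMAGE_SERVICE = (
    "https://museosansevero.jarvis.memooria.org/images/iiif/db/"
    "ccdede87-8dd3-4334-b7d9-e9e6260bb688"
)
PPI = 400  # physical scale reported in the log


def info_json_url(service_base: str) -> str:
    """IIIF Image API convention: image information lives at {base}/info.json."""
    return service_base.rstrip("/") + "/info.json"


def summarize(info: dict, ppi: int = PPI) -> dict:
    """Derive pixel count and physical size from an info.json document."""
    width, height = info["width"], info["height"]
    return {
        "gigapixels": width * height / 1e9,
        "inches_per_pixel": 1 / ppi,
        "width_cm": width / ppi * 2.54,
        "height_cm": height / ppi * 2.54,
    }


# Stand-in for the real document: only width and height are from the log.
sample_info = json.loads('{"width": 31526, "height": 114603}')
summary = summarize(sample_info)
print(info_json_url(IMAGE_SERVICE))
print({key: round(value, 4) for key, value in summary.items()})
```

The result is about 3.61 gigapixels and a scanned height of roughly 728 cm, consistent with the figures quoted in this log.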

Reverse-Engineering the Server

The info.json claims maxWidth: 8000, maxHeight: 8000. This is a lie — or at least an overstatement.

Morgan tested systematically:

Request	Output Size	Status
`full/246,/0/default.jpg`	246 × 895	200 OK
`full/492,/0/default.jpg`	492 × 1,790	200 OK
`full/985,/0/default.jpg`	985 × 3,581	422 Rejected
`full/3940,/0/default.jpg`	3,940 × 14,325	422 Rejected
`0,0,256,256/256,/0/default.jpg` (tile)	256 × 256	200 OK
`0,0,2048,2048/256,/0/default.jpg` (tile SF8)	256 × 256	200 OK
`0,0,4096,4096/1024,/0/default.jpg` (tile SF4)	1,024 × 1,024	200 OK

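The server rejects full-image requests once the output gets large: an output 1,790 px tall succeeded and one 3,581 px tall was rejected, so the real ceiling lies somewhere in between (~2,000 px is a working estimate). Region-based tile requests, by contrast, were served reliably at every scale factor tested. Morgan built the download strategy around this constraint.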
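
The probe URLs follow the tile pattern recorded in this log, {region}/{output_w},/0/default.jpg. The sketch below builds them and shows that the full-image widths tested line up with power-of-two downsamples of the source. The helper names are hypothetical, and the floor-division rule is an inference from the numbers in the table rather than documented server behavior:

```python
from typing import Optional, Tuple

SERVICE = (
    "https://museosansevero.jarvis.memooria.org/images/iiif/db/"
    "ccdede87-8dd3-4334-b7d9-e9e6260bb688"
)
FULL_WIDTH, FULL_HEIGHT = 31526, 114603  # source dimensions from the log


def iiif_url(
    service: str,
    out_width: int,
    region: Optional[Tuple[int, int, int, int]] = None,
) -> str:
    """Build {service}/{region}/{w},/0/default.jpg; region=None means 'full'."""
    region_part = "full" if region is None else ",".join(str(v) for v in region)
    return f"{service}/{region_part}/{out_width},/0/default.jpg"


def level_size(scale_factor: int) -> Tuple[int, int]:
    """Size of the whole image downsampled by a scale factor (floor division)."""
    return FULL_WIDTH // scale_factor, FULL_HEIGHT // scale_factor


# Full-image probes: each width corresponds to one downsample level.
for sf in (128, 64, 32, 8):
    width, height = level_size(sf)
    print(sf, width, height, iiif_url(SERVICE, width))

# Region probe: a 4,096 px square of the source delivered at 1,024 px (SF4).
print(iiif_url(SERVICE, 1024, region=(0, 0, 4096, 4096)))
```

Seen this way, the 200/422 boundary falls between the 1/64 and 1/32 levels, which is all the probes can establish about the true limit.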

The Download

Morgan downloaded 224 tiles at scale factor 4:

Each tile: 4,096×4,096 px region of the source → 1,024×1,024 px output
Grid: 8 columns × 28 rows
Courtesy delay: 100ms between requests
Total download time: ~87 seconds

Then stitched them into a single composite:

File:       disinganno_composite_7881.jpg
Dimensions: 7,881 × 28,650 px (225.8 megapixels)
Size:       44.3 MB
Coverage:   728 cm scanned height (full sculpture + pedestal + alcove)
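
The grid and composite dimensions above can be reproduced from the source size alone. This is a sketch of one way to enumerate the tiles, not the downloader Morgan actually ran: the Tile structure and function name are hypothetical, and clipping edge tiles with floor division is an assumption chosen because it reproduces the 7,881 × 28,650 px composite:

```python
from math import ceil
from typing import Iterator, NamedTuple

FULL_WIDTH, FULL_HEIGHT = 31526, 114603  # source dimensions from the log
SCALE_FACTOR = 4
TILE_OUT = 1024                          # output tile edge in pixels
TILE_SRC = TILE_OUT * SCALE_FACTOR       # 4,096 px of source per tile edge


class Tile(NamedTuple):
    col: int
    row: int
    region: str     # "x,y,w,h" in source pixels
    out_width: int  # requested output width
    paste_x: int    # top-left corner of the tile in the composite
    paste_y: int


def tile_grid() -> Iterator[Tile]:
    """Yield every tile, clipping the right and bottom edges to the image."""
    cols = ceil(FULL_WIDTH / TILE_SRC)
    rows = ceil(FULL_HEIGHT / TILE_SRC)
    for row in range(rows):
        for col in range(cols):
            x, y = col * TILE_SRC, row * TILE_SRC
            w = min(TILE_SRC, FULL_WIDTH - x)
            h = min(TILE_SRC, FULL_HEIGHT - y)
            yield Tile(col, row, f"{x},{y},{w},{h}", w // SCALE_FACTOR,
                       col * TILE_OUT, row * TILE_OUT)


tiles = list(tile_grid())
composite_size = (FULL_WIDTH // SCALE_FACTOR, FULL_HEIGHT // SCALE_FACTOR)
print(len(tiles), composite_size)
```

Each tile's region and out_width slot directly into the tile pattern, and paste_x, paste_y give the position for stitching. With 224 requests and a 100 ms pause between them, the pauses account for only about 22 of the ~87 seconds; the rest was presumably transfer and server time.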

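The Pillow imaging library flagged this as a potential decompression bomb: at 226 megapixels, the composite exceeds the roughly 179-megapixel level (twice Pillow's default MAX_IMAGE_PIXELS) at which opening an image raises an error rather than a warning. Morgan disabled the check. She knows the provenance of her images.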
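
Pillow's guard has two tiers, and the composite lands in the stricter one. The sketch below mirrors that decision in plain Python so it runs without Pillow installed; bomb_check is a hypothetical helper, not a Pillow function, while the default limit is the value Pillow defines for Image.MAX_IMAGE_PIXELS:

```python
# Pillow's default Image.MAX_IMAGE_PIXELS, about 89.5 million pixels.
DEFAULT_MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3


def bomb_check(width: int, height: int, limit=DEFAULT_MAX_IMAGE_PIXELS) -> str:
    """Mirror Pillow's decision: 'ok', 'warning', or 'error' for an image size."""
    if limit is None:
        return "ok"  # the effect of setting PIL.Image.MAX_IMAGE_PIXELS = None
    pixels = width * height
    if pixels > 2 * limit:
        return "error"    # Pillow raises DecompressionBombError
    if pixels > limit:
        return "warning"  # Pillow emits DecompressionBombWarning
    return "ok"


print(bomb_check(7881, 28650))              # the composite with default settings
print(bomb_check(7881, 28650, limit=None))  # with the check disabled
```

A narrower alternative to disabling the check entirely might be to raise the limit just enough to admit the composite, for example Image.MAX_IMAGE_PIXELS = 7881 * 28650, which keeps the guard active for anything larger.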

One Canvas

The IIIF manifest contains a single canvas — one frontal composite. No multi-angle coverage. The back of the sculpture remains unseen in any known digital record. Any total count must flag back-side estimates as EXTRAPOLATED.

The Question of Firsts

After the download, Morgan claimed: "Nobody has ever run computer vision against this particular scan. Morgan goes first."

Jeff challenged this. Morgan corrected herself:

What I can say with reasonable confidence:

Nobody has published a count of the holes.

The Haltadefinizione scan is recent (catalog launched September 2025) and access-controlled.

But I genuinely don't know whether conservators or Haltadefinizione's own team have run computational analysis internally. They almost certainly digitized it for scientific study.

What I should have said: nobody appears to have published a computational hole count, and this IIIF endpoint hasn't been widely accessed for open research. Whether Morgan is truly "first" — that's unknowable.

The literary truth holds regardless: Morgan's count is doomed to be wrong not because she lacks data, but because she's deterministic and Art's rules aren't. Having the best scan in existence doesn't fix that. It just makes the failure more precise.

State at End of Session

Source Images: 28 files across 4 sources

Source	Files	Best Resolution	Weight
Gigapixel (Haltadefinizione)	1 composite + 224 tiles	7,881 × 28,650 px	1.0
Flickr (Sivyer)	19 photos	Up to 1,600 × 1,063 px	0.5
Wikimedia Commons	5 images	Up to 1,400 × 1,050 px	0.2
Met Museum	3 engravings	Up to 2,675 × 3,770 px	0.1

Pipeline: Complete and tested

27 Python files, 22 tests passing
5 detection methods operational (Gabor, adaptive, black-hat, watershed, Hough)
Segmentation tuned with percentile fallback
Classification implementing computational proxies for Art's rules
Multi-view reconciliation ready

What Remains

Run the full pipeline on the gigapixel composite
Tune detection parameters for the ultra-high-resolution source
Acquire book photos (Jeff's phone photos of printed pages)
Extract video frames from YouTube walkthroughs
Build the deep learning methods (SAM, U-Net) — deferred until classical methods are baselined
Jeff's trip to Naples — Art counts in person

Technical Notes

IIIF Endpoint Reference

Catalog Entry:  https://catalogoscientifico.museosansevero.it/catalogue/26/
Manifest:       https://museosansevero.jarvis.memooria.org/meta/iiif/2726e7f9-b3e6-4520-b9a5-511e7f3fa078/manifest
Image Service:  https://museosansevero.jarvis.memooria.org/images/iiif/db/ccdede87-8dd3-4334-b7d9-e9e6260bb688
Info JSON:      .../info.json
Tile Pattern:   .../{x},{y},{w},{h}/{output_w},/0/default.jpg

Server Behavior (Undocumented)

info.json advertises maxWidth: 8000, maxHeight: 8000, but full-image requests fail well below that: 1,790 px output height succeeded and 3,581 px was rejected (actual limit estimated at ~2,000 px)
Region-based tile requests worked reliably at every scale factor tested (1, 4, 8)
Available formats: JPEG, TIFF, WebP
Rate limiting: not observed with 100ms courtesy delay
Domain tenant header: museosansevero

Key Dependencies

Python 3.9.6, OpenCV 4.x, NumPy, scikit-image, Pillow
PIL.Image.MAX_IMAGE_PIXELS = None required for the 226-megapixel composite
Flickr API key stored in .env (gitignored)

"Count until you arrive at the same number twice." — Art's Rule 3

Morgan's curse: given the same input, she always arrives at the same number on the first try.

—J