Visualizations
Charts and diagrams exploring thirty years of Harris's List across sixteen editions.
Fourteen views into thirty years of Harris's List of Covent Garden Ladies (1761–1794, 2,217 extracted entries across 16 of the 18 surviving editions). Each chart filters by an edition range: drag the two-handled slider to scrub forward and backward through the publication run. Every visualisation comes with a short reading, a list of features to look for, and a candid note on the data caveats behind the picture.
Where the trade situated itself in the metropolis, and where its women said they came from.
The List's geography is sticky: Covent Garden anchors every edition, with Soho a steady second, and a long tail of streets drifts in and out as fashion and policing redrew the map of London's pleasure economy. Stacked by edition, the top dozen localities account for around 40 % of all entries; the remainder are spread across hundreds of small streets, courts, and lodgings.
Locality extraction uses the last meaningful comma-separated segment of each normalised address, lightly cleaned. Editorial inconsistencies (Oxford Street vs. Oxford Road, abbreviated parishes) mean small streets may be split across near-duplicate names; trust the top bands more than the long tail.
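The extraction rule described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `extract_locality` is hypothetical, and "meaningful" is assumed here to mean a non-empty segment containing at least one letter.

```python
import re
from typing import Optional


def extract_locality(address: str) -> Optional[str]:
    """Return the last meaningful comma-separated segment of an address."""
    segments = [s.strip() for s in address.split(",")]
    # Assumption: a segment is "meaningful" if it is non-empty and has a letter,
    # so bare house numbers and empty trailing segments are skipped.
    meaningful = [s for s in segments if s and re.search(r"[A-Za-z]", s)]
    if not meaningful:
        return None
    # Light cleaning: collapse runs of whitespace, trim trailing punctuation.
    locality = re.sub(r"\s+", " ", meaningful[-1]).strip(" .;:")
    return locality or None
```

Note that a rule like this keeps "Oxford Street" and "Oxford Road" as separate localities, which is exactly the near-duplicate splitting the caveat warns about.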
Beyond named parishes and streets, the List's addresses can be plotted directly onto London. Each circle here represents one geocoded address from the corpus; size scales with how many entries share that exact location, summed across the visible edition range. The result is a fine-grained map of where the trade actually clustered, house by house, street by street.
Geocoding for the corpus relies on 18th-century street nomenclature reconciled against modern coordinates; many addresses ("the Bedford Arms", "next door to the Cock") are positioned at the centroid of the nearest known street rather than the precise building. Read the circle sizes as relative concentrations, not absolute precision.
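The count behind each circle, summed across the visible edition range, might be computed along these lines. The sketch assumes each entry is reduced to an (edition year, location key) pair; the function name and that data shape are illustrative, and how a count is then scaled to a circle radius is not specified here.

```python
from collections import Counter


def location_counts(entries, first_year, last_year):
    """Count entries per location within an inclusive edition-year range.

    `entries` is an iterable of (edition_year, location_key) pairs, a
    hypothetical shape chosen for this illustration.
    """
    counts = Counter()
    for year, location in entries:
        if first_year <= year <= last_year:
            counts[location] += 1
    return counts
```

Moving either slider handle simply changes `first_year` or `last_year`, so the circles grow and shrink as editions enter and leave the window.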
Where the List names a place of origin, the picture is of a thoroughly English metropolis with a substantial Irish minority and smaller streams from Scotland, Wales, and the Continent. Beyond Britain and Ireland, French, Italian, German, and Jewish origins appear in single-figure counts per edition: small absolute numbers that nevertheless register the cosmopolitanism of late-Georgian London.
Origin is recorded for only a minority of entries (around 10–40 % per edition), and the editors' default is silence rather than "English". The streamgraph reflects only women whose origin is positively asserted; absolute volumes therefore understate the true catchment.
Where the streamgraph reads origins as proportions across editions, the map reads them as places on the globe. Each bubble is positioned at a conventional centroid for one of the List's origin buckets (Edinburgh for "Scottish", Dublin for "Irish", Paris for "French") and sized by total count over the visible edition range. The geography of the catchment, in one frame.
Bubble positions are conventional centroids assigned per origin taxonomy, not actual birthplaces. The Jewish bubble in particular is conventionally placed at Frankfurt to represent the Ashkenazi diaspora; women so identified came from many places. Granular sub-national origins (Yorkshire, Edinburgh, Dublin) exist in the source text but are not yet extracted; that's a future enhancement.
The cohort itself: its ages, the codified vocabulary of beauty applied to it, and the accomplishments it advertised.
Where ages are recorded, women in the List cluster sharply in their late teens and early twenties, but the data itself shifts dramatically over the run. The first two decades record an age for only a handful of entries; from 1779 onward the editors begin to volunteer ages routinely, and a picture emerges of a cohort centred just above twenty, occasionally extending into the thirties.
Ages were recorded inconsistently before c. 1779: the early editions show only a handful of values and are flagged here with low-N markers and reduced opacity. Density curves for those editions are statistically noisy and should be read as suggestive at best, not representative.
The List's physical descriptions are not free-form; they form a tightly cycled vocabulary (fair, brunette, auburn, genteel, elegant) combined and recombined across hundreds of entries. The sunburst nests four of these dimensions (complexion → hair → eyes → build) and reveals which combinations the writers reach for most often.
Entries that did not supply a particular feature (e.g. no eye colour given) are bucketed as Unspec; the sunburst silently drops UNKNOWN taxonomy codes. Hover any wedge to see the path; the centre count reflects the cohort across the visible editions only.
Beyond beauty, the List catalogues a recurring set of "accomplishments", the cultivated abilities that distinguished the upper end of the trade from its more anonymous tiers: singing, dancing, instruments, conversation, languages, even drawing and reading. The chord diagram counts every pair of skills that co-occur within a single entry, drawing thicker ribbons where two abilities travel together.
Skill mentions are uneven across editions (15–50 % coverage) and are highly editor-mediated; absence of a skill in the matrix does not imply absence in life. Hover an arc to see total mentions; thin co-occurrence ribbons reflect the data's sparseness, not necessarily the rarity of the underlying pairing.
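The pair counting behind the chord diagram can be illustrated as follows. This is a sketch under the assumption that each entry has already been reduced to a list of skill labels; the function name `skill_cooccurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def skill_cooccurrence(entries):
    """Count unordered skill pairs that appear together within one entry.

    `entries` is an iterable of skill-label lists, one list per entry.
    """
    pairs = Counter()
    for skills in entries:
        # Deduplicate and sort so that ("dancing", "singing") is a single key
        # regardless of the order in which the entry mentions the two skills.
        unique = sorted(set(skills))
        pairs.update(combinations(unique, 2))
    return pairs
```

An entry that names only one skill contributes to that skill's arc total but to no ribbon, which is one reason ribbons stay thin when coverage is sparse.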
Half of the corpus's entries position the woman in relation to someone else: a keeper, a husband, an acquaintance, or a more generic mention. Stacked as shares of each edition, those relations sketch the editorial vocabulary: which years lean on the kept-mistress framing, which name a husband (rarely), which simply place the woman in a circle of associates, and which leave her structurally alone on the page.
Buckets are mutually exclusive by precedence (keeper > husband > acquaintance > mention > none): an entry with both a keeper and an acquaintance is counted under "keeper". This trades literal mention counts for a cleaner read of editorial framing; tooltips show the underlying raw counts. We started the design as an arc diagram of cross-entry relations, but the dash-redacted names (Capt. W—, Lord C—) and role-descriptor placeholders meant that only a handful of figures genuinely link multiple entries, far too few to draw arcs. This stacked view honours the data that exists.
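The precedence rule amounts to taking the first label in a fixed order that the entry actually carries. A minimal sketch, assuming each entry's relations arrive as a list of the four labels named above (the function name and input shape are illustrative):

```python
# Highest-precedence label first, as stated in the note above.
PRECEDENCE = ("keeper", "husband", "acquaintance", "mention")


def relation_bucket(relations):
    """Assign an entry to exactly one bucket by precedence."""
    present = set(relations)
    for label in PRECEDENCE:
        if label in present:
            return label
    # No recognised relation: the woman stands alone on the page.
    return "none"
```

For example, `relation_bucket(["acquaintance", "keeper"])` yields `"keeper"`, while an entry with no relations falls into `"none"`.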
What the List records when it puts a number to a woman's time, and the lives sketched between guineas.
When the List puts a guinea figure to a woman's company, it does so unevenly: most editions name a price for only a handful of entries, while the late 1780s read like a published price list. Plotted against age, and sized by years on the town, the cloud sketches a Georgian sex-trade economy in which experience commanded a premium and youth a separate one.
Prices include only per-encounter rates with a recognisable currency unit; weekly retainers and "kept by" arrangements are excluded because they encode duration rather than transaction. 1771 contributes only one entry with both age and price. Read this as a sample of named rates, not a market-wide tariff.
Every edition is here as a single column on a thirty-four-year strip, coloured by the median guinea-rate it names, with a small box-and-whisker showing the spread. The years between editions are hatched: the List is silent there, and we don't impute.
Same exclusion rule as the scatter (per-encounter prices only). Whiskers are min/max, not the classical 1.5·IQR, because editions with fewer than four entries can't support quartile whiskers. Hatched years mean "no edition", not "no price".
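The per-edition summary described here might look like the sketch below. The function name, the returned field names, and the `low_n` flag are assumptions made for illustration; the threshold of four is taken from the remark about quartile whiskers.

```python
from statistics import median

LOW_N = 4  # assumed threshold: below this, quartile whiskers are not meaningful


def box_summary(prices):
    """Summarise one edition's per-encounter prices for a min/max box plot."""
    if not prices:
        return None  # the edition names no usable price
    values = sorted(prices)
    return {
        "n": len(values),
        "median": median(values),
        "whisker_low": values[0],    # minimum, not Q1 - 1.5 * IQR
        "whisker_high": values[-1],  # maximum, not Q3 + 1.5 * IQR
        "low_n": len(values) < LOW_N,
    }
```

A `None` result corresponds to an edition with no recorded price, which is a different case from a hatched year with no edition at all.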
When the List explains how a woman arrived where she did, it tells two short stories: a country of origin paired with a precipitating event, and that event paired with whatever work preceded it. Plotted as a pair of Sankey diagrams, the data names the well-trodden corridors (English seductions, Irish abandonments, milliners and servants flowing into the trade) and the much rarer side routes.
Each panel is plotted independently: the full origin → reason → occupation chain is recorded for only about 2.5 % of the corpus, so a single three-tier diagram would be statistically thin. The two pairs (origin → reason, ~169 entries; reason → occupation, ~161 entries) each draw on their own coverage. The right panel buckets ~190 freeform occupation strings into seven categories: a useful summary, but not the full lexical detail of the original text. The slider trims both panels in tandem.
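Why the two panels have different coverage is easiest to see in code: each one counts only the entries where both of its own fields are present. The sketch below assumes entries are dictionaries with hypothetical keys `origin`, `reason`, and `occupation`; it is an illustration, not the project's pipeline.

```python
from collections import Counter


def sankey_links(entries, source_key, target_key):
    """Count (source, target) flows over entries that record both fields."""
    links = Counter()
    for entry in entries:
        source = entry.get(source_key)
        target = entry.get(target_key)
        # An entry missing either field contributes nothing to this panel,
        # even if it contributes to the other one.
        if source and target:
            links[(source, target)] += 1
    return links
```

Calling it once with `("origin", "reason")` and once with `("reason", "occupation")` gives the two independent link tables, each drawing on its own subset of the corpus.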
What the editors quoted, what they invoked, and the literary register they wrote in.
Across the corpus, 950 entries cite a named figure: a Roman goddess, a London tavern, a popular play, a remembered politician. The mix sketches the editors' mental library: heavily classical, with steady streams of theatre and contemporary celebrity gossip woven through. Venus alone is invoked over two hundred times.
Reference type is LLM-classified across six buckets, so small misclassifications are to be expected. The leaderboard normalises light name variants ("Mr. Pope" and "Alexander Pope" are counted together) but does not deduplicate semantically related figures. Coverage is fairly even across editions (33–83 entries with refs each), so cross-edition comparison reads honestly.
From 1773, the editors began appending a verse to nearly every entry: a couplet, a quatrain, sometimes an eight-line passage cribbed from Pope or Shakespeare. The pattern is editorial policy, not authorial inspiration: verse rates jump from 7 % in 1771 to 69 % in 1773, then to 96 %+ across the 1780s. All 987 verses across the corpus have now been enriched with attribution, literary period, genre, register, function-in-entry, sentiment tones, formal features (meter / rhyme / stanza), and themes, allowing the editors' poetic register to be read along several axes at once.
Only a fraction of the verses have explicit attributions, and only a small number are confidently attributable. Each record carries an attribution-certainty flag: about 80 verses are high-confidence canonical, about 22 are medium / low, and the remaining ~885 are unattributed but classified by genre / period / register. The original 56 explicit editorial attributions are treated as ground truth. The length-bucket panel is derived purely from verse text (no extraction layer), so it remains the most robust dimension.
Thirty years of issues, one publisher, one persistent title, and a slow drift in how the List addresses its reader. Each year's volume opens with the same imprint and the same price, but the subtitle changes mid-decade, H. Ranger's address moves at least twice, and the tone of the framing material drifts: a long philosophical apologia in 1761; an editorial leanness through the 1770s; and from 1783 onward the steady appearance of a closing address that constructs a knowing, intimate complicity with the courteous reader. The List doesn't only catalogue women; it cultivates a relationship.
Both the metadata (titles, addresses, prices) and the qualitative tags (preface stance, paratext pragmatic functions, afterword reader-relation) are extracted from the 16 title-page regions and end-of-volume sections of the corrected-OCR text. Boolean flags and quoted excerpts are well-grounded in the source; the controlled-vocab tags (preface_stance, pragmatic_functions) are interpretations and should be read as one careful reader's reading, not as canonical literary taxonomy. Two outstanding editions (1774, 1777) are not yet OCR'd; their card slots stay empty until they land.