Tell HN: YC companies scrape GitHub activity, send spam emails to users

Hacker News
February 26, 2026
AI-Generated Deep Dive Summary
Martin from GitHub describes a growing problem: startups and other entities scrape Git commit data to harvest user information, often to send unsolicited email. The practice violates GitHub's terms of service because it collects personal details, such as the email addresses embedded in every Git commit. The scraping is not limited to startups; it spans many actors across the tech ecosystem.

GitHub has implemented measures such as throttling API traffic and providing tools to anonymize commits, but the problem persists because Git itself records author information in commit data. Martin notes that although masking email addresses in commits is technically feasible, many open-source projects require real identities, which complicates matters. GitHub suggests committing with a "no-reply" email address, which keeps commits linked to your account without exposing personal contact information.

For individuals, the issue underscores the importance of managing digital footprints and protecting personal data in collaborative environments. For companies, it raises ethical concerns about data collection practices and highlights the need for responsible engagement with open-source communities.

The debate over scraping data from Git repositories touches on privacy, ethics, and the challenge of balancing openness with user protection. While GitHub continues to refine its policies, users are encouraged to adopt strategies such as email aliases and careful commit management to safeguard their information.
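As a rough illustration of the "no-reply" suggestion in the summary, the Python sketch below builds a GitHub no-reply commit address and flags author emails that are not masked. It assumes GitHub's documented address formats (ID+USERNAME@users.noreply.github.com, or USERNAME@users.noreply.github.com for older accounts). The function names, the username "octocat", and the numeric ID are hypothetical and chosen only for illustration; none of this comes from the original thread.

```python
from typing import Iterable, List, Optional

# Assumption: GitHub no-reply commit addresses live under this domain.
NOREPLY_DOMAIN = "users.noreply.github.com"


def noreply_address(username: str, user_id: Optional[int] = None) -> str:
    """Build a no-reply address (hypothetical helper, not a GitHub API)."""
    if user_id is None:
        # Older, username-only form.
        return f"{username}@{NOREPLY_DOMAIN}"
    # Newer form that prefixes the numeric account ID.
    return f"{user_id}+{username}@{NOREPLY_DOMAIN}"


def is_noreply(email: str) -> bool:
    """Return True if the email is under the no-reply domain."""
    return email.strip().lower().endswith("@" + NOREPLY_DOMAIN)


def exposed_emails(author_emails: Iterable[str]) -> List[str]:
    """Return the distinct author emails that are not masked, sorted."""
    normalized = {e.strip().lower() for e in author_emails}
    return sorted(e for e in normalized if not is_noreply(e))


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(noreply_address("octocat", 583231))
    print(exposed_emails([
        "jane@example.com",
        "583231+octocat@users.noreply.github.com",
    ]))
```

The resulting address would typically be applied with git config --global user.email, so that later commits carry the masked address. Commits already pushed keep whatever email they were created with.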
Originally published on Hacker News on 2/26/2026