<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Streaky's Blog &#187; Paste2.org</title>
	<atom:link href="http://mybrokenlogic.com/tag/paste2/feed/" rel="self" type="application/rss+xml" />
	<link>http://mybrokenlogic.com</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Mon, 19 Jul 2010 21:50:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Running a Pastebin&#8230;</title>
		<link>http://mybrokenlogic.com/2009/11/09/running-a-pastebin/</link>
		<comments>http://mybrokenlogic.com/2009/11/09/running-a-pastebin/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 16:25:18 +0000</pubDate>
		<dc:creator>streaky</dc:creator>
				<category><![CDATA[Paste2.org]]></category>
		<category><![CDATA[ddos mitigation]]></category>
		<category><![CDATA[high load]]></category>
		<category><![CDATA[mirroring]]></category>

		<guid isPermaLink="false">http://mybrokenlogic.com/?p=24</guid>
		<description><![CDATA[Is hard work sometimes. Paste2.org&#8217;s code is written to be fast. The problem with that is, if I leave it alone for a day, it can take large amounts of illegitimate traffic without really notifying me, because the load doesn&#8217;t go high enough for the server to start alerting me that things [...]]]></description>
			<content:encoded><![CDATA[<p>Is hard work sometimes.</p>
<p>Paste2.org&#8217;s code is written to be fast. The problem with that is, if I leave it alone for a day, it can take large amounts of illegitimate traffic without really notifying me, because the load doesn&#8217;t go high enough for the server to start alerting me that things are going wrong.</p>
<p>Take last night, for example: I just happened to look at Munin and saw the first spike of this (the part with the big red updates block in the graph):</p>
<p><a href="http://mybrokenlogic.com/wp-content/uploads/2009/11/crawl-fail.png"><img class="alignright size-full wp-image-25" title="Fail" src="http://mybrokenlogic.com/wp-content/uploads/2009/11/crawl-fail.png" alt="Fail" width="495" height="343" /></a>This event, which peaked at almost 400 queries/second (and if I tell you paste2.org hardly does any SQL queries, you&#8217;ll get why I was pretty pissed off when I noticed this), was pretty massive traffic coming from a lot of different IPs. A lot of people would assume that&#8217;s a DDoS attack, but I&#8217;m pretty sure it&#8217;s somebody trying to mirror the site.</p>
<p>If I may slide slightly off-topic for a second, it&#8217;s a bit of a win for the much-hated query cache when your MySQL server is set up right and your code is asking the right questions &#8211; look at the number of cache hits.</p>
<p>You&#8217;ll notice that the number of queries drops off at around midnight; this is the point when I noticed something was amiss and did something about it.</p>
<p>I have a script that scours the access log and adds the IPs it pulls out to an iptables chain, which, naturally, stops all inbound connections from them.</p>
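<p>To give a rough idea of what a script like that does, here is a minimal sketch in Python. It is not the actual paste2.org script: the log format (client IP as the first field, as in Common Log Format), the request threshold and the chain name <code>PASTE2_BLOCK</code> are all assumptions made for illustration. It only builds the iptables commands and leaves running them to the caller.</p>

```python
import re
from collections import Counter

# Assumption: the client IPv4 address is the first field on each log line.
IP_AT_LINE_START = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")


def count_requests(log_lines):
    """Count requests per client IP from an iterable of access-log lines."""
    counts = Counter()
    for line in log_lines:
        match = IP_AT_LINE_START.match(line)
        if match:  # lines that do not start with an IP are ignored
            counts[match.group(1)] += 1
    return counts


def offenders(counts, threshold):
    """Return the IPs with at least `threshold` requests, sorted for stable output."""
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)


def block_commands(ips, chain="PASTE2_BLOCK"):
    """Build (but do not run) one DROP rule per IP for a hypothetical chain."""
    return [["iptables", "-A", chain, "-s", ip, "-j", "DROP"] for ip in ips]
```

<p>Each command list could then be handed to something like <code>subprocess.run</code> as root. Keeping the parsing separate from the blocking makes it easy to check what would be blocked before anything is actually dropped.</p>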
<p>The problem is that until about 5 minutes ago it was all run manually, because in the past people have got the idea after a few rounds of that.</p>
<p>Not this time: note what happens after midnight &#8211; it slowly picks up again until it&#8217;s just as bad as it was. So, as of the last few minutes, the whole thing has been completely automated.</p>
<p>In case you&#8217;re wondering, whilst it&#8217;s nice having the site load tested, there are two main issues. Firstly, nobody has ever asked if they can have the paste files, or told me why they want them all. Secondly &#8211; as you&#8217;ll see from the first part with all the updates &#8211; they were triggering the code which determines if a visitor is a robot or not and decides if it should update the last viewed date, which in turn determines when old posts should be deleted. That&#8217;s probably the worst part of people doing stuff like this: it screws up the reliability of a system which is essentially a spam removal process. Legit posts that people need will be visited and kept; spam won&#8217;t be visited and thus gets deleted after a time. All these posts are now marked as updated last night, and the 95% that will actually be spam will survive in the site for another 60 days.</p>
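<p>For anyone who wants that expiry logic spelled out, here is a rough sketch in Python. The 60-day retention comes from the numbers above; everything else (the dict-based paste record, the user-agent substring check, the function names) is a hypothetical stand-in, not how paste2.org actually detects robots.</p>

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=60)  # pastes not viewed for this long get deleted

# Hypothetical robot check: a few substrings looked up in the user agent.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def looks_like_robot(user_agent):
    """Crude stand-in for robot detection based on the user-agent string."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def record_view(paste, user_agent, now):
    """Refresh last_viewed only when the visitor does not look like a robot."""
    if not looks_like_robot(user_agent):
        paste["last_viewed"] = now
    return paste


def is_expired(paste, now):
    """A paste is due for deletion once nobody has viewed it for RETENTION."""
    return now - paste["last_viewed"] >= RETENTION
```

<p>A crawler that gets past the robot check refreshes <code>last_viewed</code> on every paste it touches, which is exactly how a mirroring run resets the clock on spam that would otherwise have expired.</p>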
<p>I wonder how long it will be until these clowns get the message. Anyway, I can go back to my day job now that the script is chugging away on its own.</p>
]]></content:encoded>
			<wfw:commentRss>http://mybrokenlogic.com/2009/11/09/running-a-pastebin/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Paste2.org Updates</title>
		<link>http://mybrokenlogic.com/2009/02/18/paste2org-updates/</link>
		<comments>http://mybrokenlogic.com/2009/02/18/paste2org-updates/#comments</comments>
		<pubDate>Wed, 18 Feb 2009 17:47:56 +0000</pubDate>
		<dc:creator>streaky</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Paste2.org]]></category>
		<category><![CDATA[application server]]></category>
		<category><![CDATA[emacs]]></category>

		<guid isPermaLink="false">http://mybrokenlogic.com/?p=10</guid>
		<description><![CDATA[So I&#8217;ve been working on a major Paste2.org update for a while now. One of the major things I&#8217;m currently doing with it is making it work in the application server I described in my previous post. The improvements in performance will mean it&#8217;ll scale to the traffic increases it&#8217;s been getting for some time [...]]]></description>
			<content:encoded><![CDATA[<p>So I&#8217;ve been working on a major <a href="http://paste2.org/" target="_blank">Paste2.org</a> update for a while now. One of the major things I&#8217;m currently doing with it is making it work in the application server I <a href="http://mybrokenlogic.com/2009/02/18/back-in-the-game-with-1-swing/" target="_blank">described in my previous post</a>. The improvements in performance will mean it&#8217;ll keep scaling with the traffic increases it&#8217;s been getting for some time to come, without needing to upgrade its infrastructure. A few times paste2 has come very close to breaking into the top 10k sites in the Alexa rankings, and it consistently hangs around the 20k mark, meaning it&#8217;s my most successful personally-owned site to date. To some people it might not be that impressive, but I guess it&#8217;s a bit of a milestone for sites I personally own.</p>
<p>I&#8217;ve also either added or am working on adding a few new features.<span id="more-10"></span></p>
<div id="attachment_11" class="wp-caption alignright" style="width: 160px"><a href="http://mybrokenlogic.com/wp-content/uploads/2009/02/p2-new-screenshot.png"><img class="size-thumbnail wp-image-11" title="New Screenshot (paste2.org)" src="http://mybrokenlogic.com/wp-content/uploads/2009/02/p2-new-screenshot-150x150.png" alt="Screenshot of the new template, I know, the logo is horrible!" width="150" height="150" /></a><p class="wp-caption-text">Screenshot of the new template, I know, the logo is horrible!</p></div>
<p>The first one that&#8217;s pretty much done is diffing between pastes. Essentially, from a paste&#8217;s page you&#8217;ll be able to show the diff of two pastes at the click of one button, without messing around entering numbers.</p>
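<p>As an illustration of the idea only (this is not the site&#8217;s own code), a unified diff of two paste bodies can be sketched in a few lines of Python with the standard <code>difflib</code> module. The function name and the <code>paste/1</code> and <code>paste/2</code> labels are made up for the example.</p>

```python
import difflib


def diff_pastes(old_text, new_text, old_name="paste/1", new_name="paste/2"):
    """Return a unified diff of two paste bodies as a single string."""
    lines = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(lines)
```

<p>Two identical pastes give an empty string, so the page can simply say there is nothing to show in that case.</p>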
<p>Secondly the site has a new theme that I&#8217;m pretty happy with now. It&#8217;s lighter and much cleaner, and will hopefully provide a better experience for users.</p>
<p>I also want to add (text) file uploading to paste2 when creating pastes. When you have a big file, pasting it into a browser window can be really annoying.</p>
<p>Another major project I want to do is a remote API, probably using <a href="http://en.wikipedia.org/wiki/SOAP" target="_blank">SOAP</a>, for people that want to create tools for interacting with the site.</p>
<p>I&#8217;ll be trying to get a beta up as soon as I can, but some current work commitments mean I can&#8217;t spend as much time on it as I&#8217;d like.</p>
<p>On the note of scripts interacting with paste2, I was contacted by <a href="http://www.emacswiki.org/emacs/AndyStewart" target="_blank">Andy Stewart</a> regarding the Emacs script he created for interacting with paste2.org. After a few emails back and forth I implemented a feature for getting the raw content of pastes so they can be grabbed by <a href="http://www.emacswiki.org/cgi-bin/emacs/Paste2" target="_blank">his script</a>. I don&#8217;t use Emacs personally, but it looks useful.</p>
]]></content:encoded>
			<wfw:commentRss>http://mybrokenlogic.com/2009/02/18/paste2org-updates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
