
Software Evolution

September 5, 2008, 7:19pm by jud

Those of us who have been around for a while constantly joke about how “I remember building that 10 years ago” every time some big “new” trend emerges. It’s always a lesson in market readiness and timing for a given idea. The flurry around Google Chrome has rekindled the conversation around distributed apps. Most folks are tied up in the concept of a “new browser,” but Chrome is actually another crack at the age-old “distributed/server-side application” problem, albeit an apparently good one. The real news in Chrome (I’ll avoid the V8 vs. TraceMonkey conversation for now) is native Google Gears support.

My favorite kind of technology is the kind that quietly gets built, then one day you wake up and it’s changed everything. Google Gears has that potential, and if Chrome winds up with meaningful distribution (or Firefox adopts Gears), web apps as we know them will finally have mark-up-level access to local resources (read “offline functionality”). This kind of evolution is long overdue.

Another lacking component on the network is the age-old, CS101 notion of event-driven architectures. HTTP GET dominates web traffic, and poor ol’ HTTP POST is rarely used. Publish and subscribe models are all but unused on the network today, and Gnip aims to change that. We see a world that is PUSH driven rather than PULL. The web has come a looooong way on GET, but apps are desperate for traditional flow paradigms such as local processor event loops. Our goal is to do this in a protocol-agnostic manner (e.g. REST/HTTP POST, XMPP, perhaps some distributed queuing model).
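To make the PUSH idea concrete, here is a minimal in-process sketch of publish and subscribe. It is only an illustration under stated assumptions: the `Hub` class, its method names, and the sample activity are hypothetical and are not Gnip’s API. Over a real network the handler call would be something like an HTTP POST or an XMPP message.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class Hub:
    """Hypothetical in-process stand-in for a push intermediary."""

    def __init__(self) -> None:
        # Maps a publisher name to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, publisher: str, handler: Handler) -> None:
        """Register a consumer callback for one publisher."""
        self._subscribers[publisher].append(handler)

    def publish(self, publisher: str, activity: dict) -> int:
        """Push one activity to every subscriber; return the delivery count."""
        handlers = self._subscribers.get(publisher, [])
        for handler in handlers:
            handler(activity)
        return len(handlers)


hub = Hub()
received: List[dict] = []

# Two consumers subscribe once, then simply wait to be called.
hub.subscribe("serviceX", received.append)
hub.subscribe("serviceX", lambda a: received.append({**a, "copy": True}))

# The producer publishes once; the hub fans the event out.
delivered = hub.publish("serviceX", {"uid": "alice", "type": "post"})
```

The point of the shape is that consumers never ask “anything new?”; they are told when something happens.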

Watching today’s web apps poll each other to death is hard. With each new product that integrates serviceX, the latency of serviceX’s events propagating through the ecosystem gets worse, and everyone loses. This is a broken model that, if left unresolved, will drive our web apps back into the dark ages once all the web service endpoints are overburdened to the point of being uninteresting.
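A rough back-of-the-envelope sketch of why polling hurts. The numbers (50 consumers, a 60-second poll interval, 2,000 events a day) are made up for illustration, and the function names are hypothetical. The sketch only counts requests that land on the producer, and assumes that in the push case the producer publishes each event once to an intermediary.

```python
SECONDS_PER_DAY = 86_400


def polling_requests_per_day(consumers: int, poll_interval_s: int) -> int:
    """Requests hitting the producer when every consumer polls on a timer."""
    return consumers * (SECONDS_PER_DAY // poll_interval_s)


def push_requests_per_day(events_per_day: int) -> int:
    """Requests the producer sends when it publishes each event once."""
    return events_per_day


# Illustrative numbers only.
polled = polling_requests_per_day(consumers=50, poll_interval_s=60)
pushed = push_requests_per_day(events_per_day=2_000)

print(polled)  # 72000
print(pushed)  # 2000
```

Polling load grows with the number of consumers whether or not anything happened; push load grows only with the number of events.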

We’ve seen fabulous adoption of our API since launching a couple of months ago. We hope that more Data Producers and Data Consumers leverage it going forward.





Building A Platform: one feature at a time.

September 4, 2008, 10:56am by jud

As Gnip evolves, we are amazed, almost daily, at what folks build using our API. My favorite thing each morning is reading the search.twitter.com feed for Gnip. Seeing how we’re being used is so cool, and much of the time we’re used in ways we didn’t anticipate. When that starts happening, you realize you’re building a Platform with a capital ‘P’.

A recent product that caught our eye was “tweetrush” (out of Cork, Ireland); http://tweetrush.com/. They’re using Gnip to build statistical information/visualization around Twitter. Look for more cool stuff from these guys; the current product is just the beginning!

The vast majority of products leveraging Gnip do so to make existing aspects of their product better (e.g. lower the latency of activity notification), but it’s really cool when we see entire products built up around our offering.

As the wheel turns.





Incremental Collection Updates

August 22, 2008, 7:02pm by jud

The Gnip API supports incremental collection updates. We’ve supported this for a while, but we didn’t do a good job communicating when it came out. Several folks are taking advantage of it, but over the past few days it’s become clear not everyone knows the functionality exists. Please see “Collection Updates” in the API doc for details.





Gnip, Mahalo, and L.A.

August 16, 2008, 3:11pm by jud

Eric and I just spent the past few days in Los Angeles. We were lucky enough to be invited as one of two presenters to Mahalo’s first hosted tech meetup; Google (Chris Schalk and Kevin Marks) took the other slot with OpenSocial. Attendance of 175 left standing room only, and Jason Calacanis, Mark Jeffrey, and crew were great hosts; thanks for having us, guys! You can view the event here. There are lots of great business ideas in the LA area, but the focus seems to be on media (no surprise) rather than technology. However, 75% of the attendees I talked to after the meetup were heavy technologists, so clearly folks want more tech representation; hopefully Mahalo’s regular tech meetups can help facilitate that.

While we were out there, we spent time talking with a dozen or so companies/people about Gnip. The discussions ranged from revenue opportunities and integration details to our product roadmap and how folks want it to look. We left with renewed focus on our coming feature set, and the need to hire more great people.

Everyone wants full data (aka activity/message payload) in activities, extended meta-data (beyond the current “type”, “guid”, “uid”, and “at” fields), and meta-data normalization. On the activity normalization front, check out the work going on at DiSo, and contribute where you can. Data Consumers obviously want a broader range of Data Producers as well, and that’s where Gnip’s polling infrastructure will come into play. We’re cranking to get as much of this done by the end of this calendar quarter as we can. If you think you can help us, please send us a note!
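For readers who haven’t looked at an activity yet, here is a hedged sketch of what a record carrying just those four fields might look like on the consumer side. Only the field names come from the post; the `Activity` class, the field meanings in the comments, the ISO 8601 timestamp format, and the sample values are assumptions for illustration, not the actual wire format.

```python
from dataclasses import dataclass
from datetime import datetime

REQUIRED_FIELDS = ("type", "guid", "uid", "at")


@dataclass(frozen=True)
class Activity:
    """Hypothetical consumer-side view of the four current fields."""

    type: str     # assumed: what kind of activity occurred
    guid: str     # assumed: unique identifier for the activity
    uid: str      # assumed: the user who performed it
    at: datetime  # assumed: when it happened

    @classmethod
    def from_dict(cls, raw: dict) -> "Activity":
        """Build an Activity, rejecting records with missing fields."""
        missing = [name for name in REQUIRED_FIELDS if name not in raw]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        return cls(
            type=raw["type"],
            guid=raw["guid"],
            uid=raw["uid"],
            # Assumes an ISO 8601 timestamp string.
            at=datetime.fromisoformat(raw["at"]),
        )


activity = Activity.from_dict(
    {
        "type": "bookmark",
        "guid": "example-guid-1",
        "uid": "jud",
        "at": "2008-08-16T15:11:00+00:00",
    }
)
```

Full payloads and extended meta-data would presumably show up as additional fields alongside these four.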





Delicious 2 is yummy.

August 8, 2008, 11:00am by jud

Gnip now has Delicious v2 data flowing through it. The delicious bookmarking data flowing through the system now includes bookmarking/tagging done via delicious plugins/API tools (e.g. toolbar buttons). It’s a nice, clean, and pure stream of data from delicious now. Enjoy!





Garbage In, Garbage Out

August 2, 2008, 3:44pm by jud

Gnip is an intermediary service for message flow across disparate network endpoints. Standing in the middle allows for a variety of value adds (Data Producers can “publish once, distribute to many,” Data Consumers can enjoy single service interaction rather than one-off’ing over and over again), but the quality of data that Data Producers push into the system is fundamental.

Only As Good As The Sum Of Our Parts

Gnip doesn’t control the quality of the data being published to it. Whether it comes in the form of XMPP messages, RSS, or ATOM, many issues can affect the data a Data Consumer receives.

  • Bad transport/delivery - The source XMPP, RSS, ATOM, or REST feed can go down. When this happens for a given Publisher, that source has vanished and Gnip doesn’t receive messages for that Publisher. We’re only as good as the data coming in. While Gnip can consume data from XMPP, RSS, ATOM, and other sources, our preferred inbound message delivery method is via our REST API. Firing off messages to Gnip directly, and not through yet another layer, minimizes delivery issues.
  • Bad data - As any aggregator (FriendFeed, Socialthing, Movable Type Activity Streams…) can attest, the data coming across XMPP, RSS, and ATOM feeds today is a mess. From bad/illegal formatting to bad/illegal data escaping, nearly every activity feed has unique issues that have to be handled on a case-by-case basis. There will be bugs. We will fix them as they arise. Once again, these issues can be minimized if Data Producers deliver messages directly to Gnip via our REST API.
  • Bad policy - This one’s interesting. Gnip makes certain assumptions about the kind of data it receives. In our current implementation we advertise to Data Consumers that Data Producers push all public, per-user change notifications generated within their systems to Gnip. This usually corresponds to the existing public API policies for said Data Producers. We will eventually offer finely tuned, Data Producer controlled data policies, but for today’s public-facing Gnip service, we do not want to see Data Producers creating publishing policies specific to Gnip. Doing so confuses the middleware dynamic we’re trying to create with our current product, and subsequently muddies the water for everyone. Imagine a Data Consumer interacting with a Data Producer directly under one policy, then interacting with Gnip under another policy; confusing. Again, we will, perhaps earlier than we think, cater to unique data policies on a per Data Producer basis, but we’re not there yet.

While addressing all of these issues is part of our vision, they’re not all resolved out of the gate.
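To give a flavor of the “bad data” problem, here is a small sketch of defensive feed handling: try to parse each entry, and set aside the ones that are malformed or incomplete instead of letting one bad entry take down the whole batch. This is an illustration only, not how Gnip’s ingestion actually works; the function names and sample entries are hypothetical, and it uses Python’s standard `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def parse_entry(xml_text: str) -> Optional[dict]:
    """Return the entry's title and link, or None if it is unusable."""
    try:
        item = ET.fromstring(xml_text)
    except ET.ParseError:
        # Illegal formatting or escaping: the XML is not well-formed.
        return None
    title = item.findtext("title")
    link = item.findtext("link")
    if not title or not link:
        # Well-formed, but missing data we need.
        return None
    return {"title": title.strip(), "link": link.strip()}


def parse_entries(raw_entries: List[str]) -> List[dict]:
    """Keep the entries that parse cleanly; skip the rest."""
    parsed = (parse_entry(entry) for entry in raw_entries)
    return [entry for entry in parsed if entry is not None]


entries = [
    "<item><title>Good</title><link>http://example.com/1</link></item>",
    # Unescaped ampersand: a classic escaping bug in real feeds.
    "<item><title>Fish & chips</title><link>http://example.com/2</link></item>",
    # Well-formed, but there is no link.
    "<item><title>No link</title></item>",
]

clean = parse_entries(entries)
```

In practice the skipped entries are where the case-by-case fixes come in: each feed tends to break in its own way.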





Gnip Is Hiring Engineers

July 30, 2008, 9:31am by jud

Check out our job posting here. If you think you fit the bill, let us know; we want to talk to you.

Please no recruiter or 3rd party inquiries.





Three (Six?) Week Software Retrospective

July 24, 2008, 7:28am by jud

I had to go back into older blog posts to remind myself when we launched; July 1st. It feels like we’ve been live since June 1st.

Looking Back

Things have gone incredibly well from an infrastructure standpoint. We’ve had to add/adjust some system monitoring parameters to accommodate the variety of Data Producers publishing into the system; different frequencies/volumes call for specialized treatment. We weren’t expecting the rate, or volume, of Collection creation we wound up with. Within three hours of going live, we had enough Collections in the system to adversely impact node startup/sync times. We patiently tuned our data model, and tuned TerraCotta locks to get things back to normal. It’s looking like we’ll be in bed with TerraCotta for the long haul.

Amazon

I’m not sure I could be any more pleased with AWS. Our core service is heavily dependent on EC2, and that’s been running sans issues. We’re working on non-Amazon failover solutions that assure uninterrupted service even if all of EC2 dies. Our backups are S3 dependent, so we had some behind-the-scenes issues last weekend when S3 was flaky; see my previous post on this issue. We haven’t had our day in the sun with outages, and I obviously hope we never do, but so far I’m walking around with a big “I <3 AWS” t-shirt on.

Other

On the convenience library front, we (Gnip + community) have made all of our code available on github. We’ve had tremendous community support and contribution on this front; so cool to see; thanks everyone!

Collections are by far the primary data access pattern (as opposed to raw public activity stream polling); not really a surprise.

Summize/Twitter has been a totally cool way to track the discussion around Gnip out in the ether. When we notice folks talking about Gnip, positive or negative, we can reach out in “real-time” and strike up a conversation.

That’s all for now.

Thanks to all the Data Producers and Consumers that have integrated with Gnip thus far!





New Gnip data consumer: Retaggr

July 22, 2008, 12:55pm by eric

Just got word that the very cool London-based service Retaggr has started working with Gnip to reduce the number of calls they make to data providers. Here’s what co-founder Nicholas Smit has to say:

Retaggr aggregates your online identity into an interactive embeddable business card. Contained within it is actual content from services like flickr, twitter and so on. We’re using Gnip to receive notifications about when our users publish data on these services, so we don’t have to poll them unnecessarily. This is great for efficiency, alleviates problems with API quotas, and helps us provide a consistent level of service to our users.

Welcome Nicholas and the Retaggr team!





New publishing partner: Muti

July 21, 2008, 12:57pm by eric

Muti is a social bookmarking site inspired by reddit and Digg but dedicated to content of interest to Africans or those interested in Africa.

Many thanks to Neville Newey and the Muti team for pushing notifications into Gnip.




