Startup anti-pattern: elephant hunting

As part of the continuing series on startup anti-patterns, we look at elephant hunting: chasing big customers and deals.

First, two stories that highlight two different sides of elephant hunting.

In 2005, Meridio was guaranteed to win a deal worth $15m+. Meridio was a small electronic documents and records management (EDRM) startup whose software ran inside some of the world’s most secure organizations: from banks to oil & gas companies to branches of government and the military. One of its happy customers, the UK Ministry of Defence (MoD), was looking to modernize its infrastructure in a massive IT procurement worth billions. Each of the two integrator consortia shortlisted for the deal had designed Meridio into the solution. The MoD deployment was the largest secure SharePoint deployment in the world at the time: a great proof point of the quality and scalability of Meridio’s software. The future looked bright.

Meridio did win the deal and get the money in the end, but the process nearly killed the company:

  • The product roadmap and development prioritization became more complicated.
  • Supporting the two fiercely competitive integrator consortia required staffing up teams with semi-duplicated responsibilities: a significant distraction and increase in burn far ahead of revenue.
  • Once the MoD deal was awarded to one of the consortia, Meridio had many employees it couldn’t put to productive use quickly. The resulting layoffs impacted culture.

The UK MoD deal was important for Meridio — it influenced the 2007 sale of the company to Autonomy, now part of OpenText — but it was less impactful from a valuation standpoint than the company imagined it’d be. Winning the deal came at the expense of distraction and operational inefficiency, both of which affected growth in other areas of the business. Also, there never was another deal like it.

And now for story #2. In 2014, Life360 struck gold. After 18 months of lengthy negotiations, Life360 landed a $50m investment deal from ADT, the global leader in home security, coupled with a strategic joint product development opportunity that could net the company tens of millions of dollars in revenue. The team was dancing on rooftops!

In 2019, long after the commercial deal was dead in the water, Life360 decided to go public early (compared to its peers), and one of the considerations was ADT’s significant position as an investor in the company. Further, after years of development that sucked up, at times, half of Life360’s engineering bandwidth, the product that launched was discontinued and made no contribution to the business. When the company struck the deal, employees were initially very excited. They believed that the organization they were working with would be as devoted to the strategic deal’s success as their small startup was. Three management team changes later, it became clear that the deal, one of the highest-priority items on Life360’s plate, was a pretty low priority for ADT. New execs at ADT didn’t feel a real commitment to it, and a private equity acquisition coupled with organizational changes didn’t help much either.

Everything is easier in hindsight, but Life360 could have avoided this. Luckily, the deal didn’t end up being a company killer and the other parts of the business helped Life360 cement a great spot as a public company. It’s probably fair to say Life360’s success happened despite the ADT deal, not because of it.

What is it?

“Elephant Hunting” is a buzzword describing the practice of targeting deals with very large customers. For example, hunting an elephant in the context of a startup could be a seed-stage company targeting the likes of Google or AT&T as a customer in a million-dollar deal. These customers can provide large contracts, but they are hard to catch and require large teams to tackle. With business-to-business (B2B) startups, there’s almost nothing more exciting (or seductive) than hunting and bagging an elephant-sized deal. It can produce huge revenue growth, provide you with highly leverageable customer references, and excite investors. Once you hunt down an elephant, it can feed many mouths (and egos) at the company for a long time. What could be better?

Be warned: the pursuit of elephants can be a dangerous game. If you fail to “kill the elephant,” it might well end up killing you. Unlike young and dynamic startups, elephants are organizational dinosaurs, and striking a deal with one requires your entire team, from sales to engineering, to engage with the elephant at different levels of its organization. This engagement happens over months, sometimes years. Even if you succeed in landing an elephant, you may get less benefit than you expected, as the cases of both Meridio and Life360 demonstrate.

Why does it matter?

Elephant hunting can bring your company to its knees. Here are some perils to be aware of:

  • No repeatability. Elephants are hard to catch and often there aren’t enough of them. Meridio never found another UK MoD. Life360 never found another ADT.
  • Heavy operational burden. When you pursue and, later, land an elephant, it’s tempting to put all your resources into serving it. But this can lead to neglecting other clients and missing out on potential opportunities. Both Meridio and Life360 suffered operationally while selling to and, later, servicing their respective elephants. Elephants may also demand extended payment terms or lower prices, which can strain a startup’s finances, so consider the financial implications carefully before taking on an elephant client.
  • Missed learning opportunities. When you and your team are laser-focused on one client, you might be missing the forest for the trees. As a startup, you seek scalable solutions that matter to most of the potential customers you want to serve. More feedback is better, and getting feedback from just one elephant makes it harder to identify the scalable, repeatable products that your target audience needs.
  • Overpromising and underdelivering. In the rush to impress an elephant, startups may make unrealistic promises they can’t keep. This can damage their reputation and lead to the loss of the elephant and future clients. Elephants have high expectations for products and services delivered, as well as a web of requirements across legal, compliance, cybersecurity, etc. that smaller companies may be incapable of servicing well.
  • Compromising your identity. When a startup lands an elephant, it’s easy to become absorbed in their world and lose sight of your own identity and values. This can lead to compromises that go against your startup’s mission and culture. Note, for example, how many big tech companies have had to compromise to do business in China.
  • Losing control. Elephants may have their own demands and expectations that clash with a startup’s way of doing things. This can lead to a loss of control and autonomy, as the startup becomes beholden to the elephant’s whims. On the partner/channel side, this relates to the platform risk anti-pattern.

In conclusion, while landing an elephant can be a huge boost for a startup, it’s important to be aware of the perils that come with it. By maintaining a balance, staying true to your values, and carefully considering the operating implications, startups can avoid the dangers of elephant hunting and build sustainable growth.

Diagnosis

Diagnosis is relatively straightforward. Here are a few signals that you might be spending too much time elephant hunting or are getting sucked into the savannah:

  • Are you and your sales team spending most of your time focused on one deal with a big enterprise client? Has this been going on for an extended period?
  • Are you increasing spend ahead of revenue more than you normally would for just one or two deals?
  • Is a significant chunk of your engineering team’s bandwidth focused on building custom features for one big customer? Does it feel like this customer is essentially dictating your roadmap for the foreseeable future?
  • Do you find yourself having to promise steep SLAs and help desk hours that you know your existing team can’t support now or in the near future? Startups often do need to stretch to deliver, but if your team feels that servicing the elephant will consume the entire company, they’re probably right.

Misdiagnosis

A common misdiagnosis stems from not fully understanding the scope and bandwidth consumption of elephants. It’s easy for the team to get excited about big deals and look the other way. Developing and delivering products to elephants comes with significant overhead, longer sales cycles, lower win rates, and, often, requirements and standards that don’t improve the joint outcome but suck a lot of time and energy from everybody in the room.

Put together KPIs and tools to help you measure the impact elephant hunting has on your sales and engineering teams, and make data-based decisions.

If your startup is investor-backed, remember that your job is to grow equity value. Revenue, profits and growth are pieces of how equity value is determined. Ask yourself whether the pursuit or even the winning of an elephant will have a meaningful positive impact on equity value given all the positive and negative externalities.

Refactored solutions

Once diagnosed, the refactoring of this anti-pattern very much depends on the set of challenges and opportunities your company has at hand. A few ideas on how to make the most of enterprise customers without consuming your entire (small) organization in the process:

  • Try to strike a smaller, multi-phase deal with the elephant. That helps both sides build the confidence and capabilities to serve each other better.
  • (Artificially) Limit the resources devoted to elephant hunting. Be ruthless about this with your sales and bizdev folks. They’re likely to gravitate towards elephant hunting — these deals tend to be very exciting.
  • Continuously measure and analyze how much your team spends on custom work (especially non-repeatable deals and non-productizable work). Pushing back might strain your relationship with the elephant customer, but good sales and customer success teams can help strike a balance and set expectations.
  • Do you have enough slack to sign a deal with an elephant? One good rule of thumb is to assume the deal will require twice the resources and time you originally expect. Under that assumption, would you still execute on the deal?

When could it help?

Does this mean you should never try to hunt elephants? No, but it does mean you should think very carefully about it, and be prepared to answer a few questions: 

  1. Where does elephant hunting fit in your sales and growth strategy: near term vs. longer term, lower-hanging fruit vs. higher up your sales tree?
  2. How many elephants are there for you to hunt? Is that a real market niche for your business?
  3. Do you have the human resources to hunt and satisfy elephant-sized customers?
  4. Do your sales, engineering and customer success people have the skillsets and experience to satisfy this species of customer? 
  5. Does your CEO have the bandwidth and skill to take down the elephant? This strategy often demands an inordinate amount of the CEO’s time. Which of the CEO’s other responsibilities might suffer?
  6. Does your company have the financial resources to survive and thrive in the face of typically slow decision and purchase cycles? Will investors give you (relatively) cheap cash so that you can wait for the revenue?

For many startups, the transition to spending more time on elephant hunting is part of the journey from childhood to adolescence. If you have good answers to the above questions and a more mature product that is ready to scale, you and your team might be ready to make the move. Tread carefully, though, so you don’t end up as yet another victim on the plains of the Serengeti.

Co-authored with Itamar Novick. More startup anti-patterns here.


More startup anti-patterns

It’s been a decade since I first assembled the list of startup anti-patterns, the repeatable ways startups waste time and money. The project has always been near and dear to my heart, as I often find myself directing entrepreneurs to the list and coaching about specific examples.

I’m partnering with my friend Itamar Novick from Recursive Ventures to add more anti-pattern content similar to what exists about ignorance and platform risk.

Avoiding startup anti-patterns is important: they waste time and money, radically increasing the chance of failure. The impact of falling for an anti-pattern is especially heavy in an environment where investor cash is again quite expensive.


Apache Spark native functions

There are many ways to extend Apache Spark, and one of the easiest is with functions that manipulate one or more columns in a DataFrame. When considering different Spark function types, it is important not to ignore the full set of options available to developers.

Beyond the two types of functions described in the previous link (simple Spark user-defined functions (UDFs) and functions that operate on Column), there are two more types of UDFs: user-defined aggregate functions (UDAFs) and user-defined table-generating functions (UDTFs). sum() is an example of an aggregate function and explode() is an example of a table-generating function. The former processes many rows to create a single value; the latter uses value(s) from a single row to “generate” many rows. Spark supports UDAFs directly and UDTFs indirectly, by converting them to Generator expressions.
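
A quick illustration of the two behaviors using the built-in functions (a minimal sketch assuming a local SparkSession):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, sum}

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq((1, Seq("a", "b")), (2, Seq("c"))).toDF("n", "letters")

// Aggregate function: many rows in, one value out
df.agg(sum($"n")).show()              // sum(n) == 3

// Table-generating function: one row in, many rows out
df.select(explode($"letters")).show() // three rows: a, b, c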

Beyond all types of UDFs, Spark’s most exciting functions are its native functions, which is how the logic of most of Spark’s Column and SparkSQL functions is implemented. Internally, Spark native functions are nodes in the Expression trees that determine column values. Very loosely speaking, an Expression is the internal Spark representation of a Column, just like a LogicalPlan is the internal representation of a data transformation (Dataset/DataFrame).

Native functions, while a bit more involved to create, have three fundamental advantages: better user experience, flexibility and performance.

The better user experience and flexibility come from native functions’ lifecycle having two distinct phases:

  1. Analysis, which happens on the driver, while the transformation DAG is created (before an action is run).
  2. Execution, which happens on executors/workers, while an action is running.

The analysis phase allows Spark native functions to dynamically validate the type of their inputs to produce better error messages and, if necessary, change the type of their result. For example, the return type of sort_array() depends on the input type. If you pass in an array of strings, you’ll get an array of strings. If you pass in an array of ints, you’ll get an array of ints.
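
You can see this analysis-time type resolution directly (a quick sketch assuming a local SparkSession):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sort_array

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

// The element type of the result is resolved during analysis, from the input:
Seq(Seq("b", "a")).toDF("arr").select(sort_array($"arr")).printSchema()
// result element type: string

Seq(Seq(2, 1)).toDF("arr").select(sort_array($"arr")).printSchema()
// result element type: integer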

A user-defined function, which internally maps to a strongly-typed Scala/JVM function, cannot do this. We can parameterize an implementation by the type of its input, e.g.,

def mySortArray[A: Ordering](arr: Array[A]): Array[A] = arr.sorted

but we cannot create type-parameterized UDFs in Spark, requiring hacks such as

spark.udf.register("my_sort_array_int", mySortArray[Int] _)
spark.udf.register("my_sort_array_long", mySortArray[Long] _)

Think of native functions like macros in a traditional programming language. The power of macros also comes from having a lifecycle with two execution phases: compile-time and runtime.

Performance comes from the fact that Spark native functions operate on the internal Spark representation of rows, which, in many cases, avoids serialization/deserialization to “normal” Scala/Java/Python/R datatypes. For example, internally Spark strings are UTF8String. Further, you can choose to implement the runtime behavior of a native function by code-generating Java and participating in whole-stage code generation (reinforcing the macro analogy) or as a simple method.

Working with Spark’s internal (a.k.a., unsafe) datatypes does require careful coding but Spark’s codebase includes many dozens of examples of native functions: essentially, the entire SparkSQL function library. I encourage you to experiment with native Spark function development. As an example, take a look at array_contains().
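
To make this concrete, here is a minimal sketch of a native function, assuming Spark 2.x-era Catalyst internals (these APIs are internal and shift between releases). Utf8ByteLength is a hypothetical example, not a Spark built-in: it returns the byte length of a string column, operating directly on the internal UTF8String representation:

import org.apache.spark.sql.Column
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
import org.apache.spark.sql.catalyst.expressions.{Expression, UnaryExpression}
import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode}
import org.apache.spark.sql.types.{DataType, IntegerType, StringType}
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical native function: the byte length of a string column.
case class Utf8ByteLength(child: Expression) extends UnaryExpression {

  // Analysis phase (runs on the driver): validate input types...
  override def checkInputDataTypes(): TypeCheckResult =
    if (child.dataType == StringType) TypeCheckResult.TypeCheckSuccess
    else TypeCheckResult.TypeCheckFailure(
      s"utf8_byte_length requires a string input, got ${child.dataType}")

  // ...and declare the result type.
  override def dataType: DataType = IntegerType

  // Execution phase (runs on executors): interpreted evaluation against the
  // internal representation; no serialization to java.lang.String.
  override protected def nullSafeEval(input: Any): Any =
    input.asInstanceOf[UTF8String].numBytes()

  // Optionally participate in whole-stage code generation by emitting Java.
  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode =
    defineCodeGen(ctx, ev, c => s"($c).numBytes()")
}

// A thin public wrapper lets it compose like any other Column function.
def utf8ByteLength(col: Column): Column = new Column(Utf8ByteLength(col.expr))

Note that null handling comes for free: UnaryExpression’s default eval() short-circuits null inputs before nullSafeEval() is ever called.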

For user experience, flexibility and performance reasons, at Swoop we have created a number of native Spark functions. We plan on open-sourcing many of them, as well as other tools we have created for improving Spark productivity and performance, via the spark-alchemy library.


Unicorn pressures and startup failures

The startup anti-patterns section of my blog summarizes the repeatable ways startups waste time & money and, often, fail. Learning from startup failure is valuable because there are many more examples of failures than successes. (Anti-)Patterns become more noticeable and easier to verify.

For the same reason, it’s useful to read the failure post-mortems founders write. It takes meaningful commitment to discover the posts and to distill the key insights from the sometimes lengthy prose (an exercise in therapy at least as much as a reporting of the facts). Luckily, there is a shortcut: the CB Insights summary of startup failures. It’s part table of contents and part CliffsNotes. It can help you pick the ones that are worth reading in full.

Some of the insights from post-mortems come from understanding the emotional biases of founders, CXOs and investors. In the uncertain startup execution environment these biases have the ability to affect behavior much more than in situations where reality is inescapable and readily quantifiable.

Speaking of emotional biases, Bill Gurley’s post on the Unicorn pressure cooker now that the magic has worn off is a must.


Advertising marketplace design

In the past decade several Nobel prizes in Economics have been awarded in the broader area of market (mechanism/auction/game) design. This is not surprising, as the combination of Internet connectivity and ample computing resources is causing automated markets to pop up all over. One of the biggest and fastest-growing in recent years has been the programmatic advertising market. For example, variations of the Vickrey–Clarke–Groves auction power the Facebook and Google ad exchanges.

When lots of players are lining up to feed at the advertising money trough, it sometimes becomes difficult to separate reality from marketing hype. The programmatic hype is that it brings efficiency to advertising (and does your laundry to boot). The reality is very different. While there are many benefits to programmatic advertising, it also causes and exacerbates many problems in the advertising ecosystem that hurt publishers, advertisers and consumers in the long run. The root cause is that the leading open programmatic protocol, OpenRTB, fails to align marketplace interests. This is what happens when adtech optimizes for volume as opposed to quality.


Angel investing strategies

My friend Jerry Neumann wrote a great post on angel investing strategies, dissecting truth and myth about different betting strategies and sharing his own approach.

The question of luck came up and a commenter linked to my work on data-driven patterns of successful angel investing with the subtext that being data driven implies index investing. That’s certainly not what I believe or recommend.

The goal of my Monte Carlo analysis was to shine a light on the main flaw I’ve seen in casual angel investing, which is the angel death spiral:

  1. Make a few relatively random investments
  2. Lose money
  3. Become disillusioned
  4. Give up angel investing
  5. Tell all your friends angel investing is terrible

Well, you can’t expect a quick win when drawing from a highly skewed distribution (and startup exits are a very skewed distribution). That’s just math, and math is rather unemotional about these things.
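
To see why, here is a toy Monte Carlo sketch in Scala (the exit distribution below is invented for illustration and is not the one from my analysis):

import scala.util.Random

// A made-up but plausibly shaped exit distribution: mostly losses,
// rare outliers that carry the portfolio. Expected multiple ~1.55x.
def exitMultiple(rng: Random): Double = rng.nextDouble() match {
  case r if r < 0.60 => 0.0   // total loss
  case r if r < 0.85 => 1.0   // money back
  case r if r < 0.95 => 3.0   // decent outcome
  case _             => 20.0  // rare outlier
}

// Probability that a portfolio of n investments loses money overall.
def lossProbability(n: Int, trials: Int = 100000): Double = {
  val rng = new Random(42)
  (1 to trials).count { _ =>
    Seq.fill(n)(exitMultiple(rng)).sum / n < 1.0
  }.toDouble / trials
}

// lossProbability(5) is far higher than lossProbability(50):
// with few shots on goal, you usually miss the outliers.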

You can get out of the angel death spiral in one of two ways. You can take the exit distribution for what it is. In that case, you need many more shots on goal (dozens of investments) to ensure a much better outcome. Alternatively, you can try to pick your investment opportunities from a different, better distribution. That’s what I like to do and this is what Jerry is advocating.

The main driver of returns for angel investors is the quality of deal flow you can win. Why? Because this changes the shape of your personal exit distribution and, in most cases not involving unicorn hunting, improves your outcomes at any portfolio size.

As an investor, you sell cash + you and buy equity. To see better deals and win them, you need to increase the value of “you.” After all, anyone’s cash is just as good as everyone else’s. The easiest way to do this is via deep, real, current expertise and relationships that are critical to the success of the companies you want to invest in, backed by a reputation that you are a helpful, easy-to-work-with angel. One way to maximize the chance of this being true is to follow some of Jerry’s advice:

  • Invest in markets that you know
  • Make multiple investments in such markets
  • Help your companies

There is a bootstrap problem, however, when new markets are concerned. How do you get to know them? Well, one way to do it is to make a number of investments in a new space. In this case, your investments have dual value: in addition to the financial return expectations (which should be reduced) you have the benefit of learning. Yes, it can be an expensive way to learn but it may be well worth it when you consider the forward benefits that affect the quality of your deal flow and your ability to win deals.

As an aside, I’ve always advised angels to not invest just for financial return. Do angel investing to increase your overall utility (in the multi-faceted economic theory sense) and do it so that it generates a return you are happy with.

In summary:

  1. Don’t attempt to pick unicorns as an angel.
  2. Where you can get high-quality deal flow you can win, do a smaller number of deals.
  3. Where needed, and if you can afford it, use higher-volume investing as a way to signal interest in a market and to learn about it so that you can get higher-quality deal flow.

JSON and JSONlines from the command line

At Swoop we have many terabytes of JSON-like data in MongoDB, Redis, ElasticSearch, HDFS/Hadoop and even Amazon Redshift. While the internal representations are typically not JSON but BSON, MsgPack or native encodings, when it comes time to move large amounts of data for easy ad hoc processing I often end up using JSON and its bulk cousin, JSONlines. This post is about what you can quickly do with this type of data from the command line.

The best JSON(lines) command line tools

There has been a marked increase in the number of powerful & robust tools for validating and manipulating JSON and JSONlines from the command line. My favorites are:

  • jq: a blazingly fast, C-based stream processor for JSON documents with an easy yet powerful language. Think of it as sed and awk for JSON but without the 1970s syntax. Simple tasks are trivial. Powerful tasks are possible. The syntax is intuitive. Check out the tutorial and manual. Because of its stream orientation and speed, jq is the most natural fit when processing large amounts of JSONlines data. If you want to push the boundaries of what is sane to do on the command line there are conditionals, variables and UDFs.
  • underscore-cli: this is the Swiss Army knife for manipulating JSON on the command line. Based on Node.js, it supports JavaScript and CoffeeScript expressions with built-in functional programming primitives from the underscore.js library, relatively easy JSON traversal via json:select and more. This also is the best tool for debugging JSON data because of the multitude of output formats. A special plus in my book is that underscore-cli supports MsgPack, which we use in real-time flows and inside memory-constrained caches.
  • jsonpath: Ruby-based implementation of JSONPath with a corresponding command line tool. Speedy it is not but it’s great when you want JSONPath compatibility or can reuse existing expressions. There are some neat features such as pattern-based tree replace operations.
  • json (a.k.a., jsontool): another tool based on Node.js. Not as rich as underscore-cli but has a couple of occasionally useful features having to do with merging and grouping of documents. This tool also has a simple validation-only mode, which is convenient.

Keep in mind that you can modify/extend JSON data with these tools, not just transform it. jsontool can edit documents in place from the command line, which is useful, for example, for quickly updating properties in JSON config files.

JSON and 64-bit (BIGINT) numbers

JSON has undefined (as in implementation-specific) semantics when it comes to dealing with 64-bit integers. The problem stems from the fact that JavaScript does not have this data type: every JavaScript number is an IEEE 754 double, which can represent integers exactly only up to 2^53. There are Python, Ruby and Java JSON libraries that have no problem with 8-byte integers, but I’d be suspicious of any Node.js implementation. If you have this type of data, test the edge cases with your tool of choice.
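
The underlying issue is easy to demonstrate on the JVM (a Scala sketch; a JavaScript Number is the same IEEE 754 double):

// Doubles represent integers exactly only up to 2^53.
val big: Long = 9007199254740993L   // 2^53 + 1

// Round-tripping through a double (effectively what any JavaScript-based
// JSON tool does) silently drops the low bit:
val roundTripped: Long = big.toDouble.toLong
println(roundTripped)               // 9007199254740992
println(roundTripped == big)        // false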

JSONlines validation & cleanup

There are times when JSONlines data does not come clean. It may include error messages or a mix of STDOUT and STDERR output (something Heroku is notorious for). At those times, it’s good to know how to quickly validate and clean up a large JSONlines file.

To clean up the input, we can use a simple sed incantation that removes all lines that do not begin with [ or {, the start of a JSON array or object. (It is hard to think of a bulk export command or script that outputs primitive JSON types.) To validate the remaining lines, we can filter through jq and output the type of the root object.

cat data.jsonlines | sed '/^[^[{]/d' > clean_data.jsonlines
cat clean_data.jsonlines | jq 'type' > /dev/null

This will generate output on STDERR with the line & column of any bad JSON.

Pretty printing JSON

Everyone has their favorite way to pretty print JSON. Mine uses the default jq output because it comes in color and because it makes it easy to drill down into the data structure. Let’s use the GitHub API as an example here.

# List of Swoop repos on GitHub
API='https://api.github.com/users/swoop-inc/repos'
alias swoop_repos="curl $API"

# Pretty print the list of Swoop repos on GitHub in color
swoop_repos | jq '.'

JSON arrays to JSONlines

GitHub gives us an array of repo objects but let’s say we want JSONlines instead, in order to prepare the API output for input into MongoDB via mongoimport. The --compact-output (-c) option of jq is perfect for JSONlines output.

# Swoop repos as JSONlines
swoop_repos | jq -c '.[]'

The .[] filter breaks up an array of inputs into individual inputs.

Filtering and selection

Say we want to pull out the full names of Swoop’s own repos as a JSON array. “Own” in this case means not forked.

swoop_repos | jq '[.[] | select(.fork == false) | .full_name]'

Let’s parse this one piece at a time:

  • The wrapping [...] merges any output into an array.
  • You’ve seen .[] already. It breaks up the single array input into many separate inputs, one per repo.
  • The select only outputs those repos that are not forked.
  • The .full_name filter plucks the value of that field from the repo data.

Here is the equivalent using underscore-cli and a json:select expression:

swoop_repos | underscore select \
    'object:has(.fork:expr(x=false)) > .full_name'

In both cases we are not saving that much code, but not having to create script files keeps things simpler. For comparison, here is the code to output the names of Swoop’s own GitHub repos in Ruby.


require 'open-uri'
require 'json'

API = 'https://api.github.com/users/swoop-inc/repos'

open(API) do |io|
  puts JSON.parse(io.read).
    reject { |repo| repo['fork'] }.
    map { |repo| repo['full_name'] }.
    to_json
end



My most favorite math proof ever

Math is beautiful and, sometimes, math becomes even more beautiful with the help of a bit of computer science. My favorite proof of all time combines the two in just such a way.

Goal: prove that the cardinality of the set of positive rational numbers is the same as that of the set of natural numbers.

This is an old problem dating back to Cantor with many proofs:

  • The traditional proof uses a diagonal argument: a geometric insight that lays out the numerator and the denominator of a rational number along the x and y axes of a plane. The proof is intuitive but cumbersome to formalize.
  • There is a short but dense proof that uses a Cartesian product mapping and another theorem. Personally, I don’t find simplicity and beauty in referring to complex things.
  • There is a generative proof using a breadth-first traversal of a Calkin-Wilf tree (a.k.a., the H tree, because of its shape). Now we are getting some help from computer science but not in a way that aids simplicity.

We can do much better.

Proof:

Given a positive rational number p/q in lowest terms, write it as the hexadecimal number pAq: the decimal digits of p, the hex digit A, the decimal digits of q. The digit A can never occur inside p or q, so the encoding parses uniquely, which makes the map an injection from the positive rationals into the naturals. QED

Examples:

  • 0/1 → 0A1 (161 in decimal)
  • ¾ → 3A4 (932 in decimal)
  • 12/5 → 12A5 (4773 in decimal)

Code (because we can):

def to_natural(p, q)
  "#{p}A#{q}".to_i(16)
end

It is trivial to extend the encoding to all rationals, not just the positive ones, as long as we require p/q to be in canonical form:

def to_natural(p, q)
  "#{p < 0 ? 'A' : ''}#{p.abs}A#{q}".to_i(16)
end

To me, this CS-y proof feels much simpler and more accessible than any of the standard math-y proofs. It is generative, reducible to a line of code, and does not require knowledge of any advanced concepts beyond number systems that are not base 10, a straightforward, intuitive extension of base-10 positional arithmetic.

Note: we don’t need to use hexadecimal. The first time I heard this proof it was done in base 11 but I feel that using an unusual base system does not make the proof better.


Monitoring Redis with MONITOR and WireShark

At Swoop we use Redis extensively for caching, message processing and analytics. The Redis documentation can be pithy at times and recently I found myself wanting to look in more depth at the Redis wire protocol. Getting everything set up the right way took some time and, hopefully, this blog post can save you that hassle.

Redis MONITOR

The Redis logs do not include the commands that the database is executing but you can see them via the MONITOR command. As a habit, during development I run redis-cli MONITOR in a terminal window to see what’s going on.

Getting set up with WireShark

While normally we’d use a debugging proxy such as Charles to look at traffic in a Web application, here we need a real network protocol analyzer because Redis uses a TCP-based binary protocol. My go-to tool is WireShark because it is free, powerful and highly customizable (including Lua scriptable). The price for all this is dealing with an X11 interface from the last century and the expectation that you passed your Certified Network Engineer exams with flying colors.

To get going:

  1. WireShark needs X11. Since even Mac OS X stopped shipping X11 by default with Mountain Lion, you’ll most likely want to grab a copy, e.g., XQuartz for OS X or Xming for Windows.
  2. Download and install WireShark.
  3. Start WireShark. If you see nothing, it may be because the app shows as a window associated with the X11 server process. Look for that and you’ll find the main application window.

Redis protocol monitoring

WireShark’s plugin architecture allows it to understand dozens of network protocols. Luckily for us, jzwinck has written a Redis protocol plugin. It doesn’t come with WireShark by default so you’ll need to install it. Run the following:


mkdir -p ~/.wireshark/plugins && cd ~/.wireshark/plugins && curl -O https://raw.github.com/jzwinck/redis-wireshark/master/redis-wireshark.lua


If WireShark is running, restart it to pick up the Redis plugin.

Now let’s monitor the traffic to a default Redis installation (port 6379) on your machine. In WireShark, you’ll have to select the loopback interface.

To reduce the noise, filter capture to TCP packets on port 6379 (capture filter: tcp port 6379). If you need more sophisticated filtering, consult the docs.


Once you start capture, it’s time to send some Redis commands. I’ll use the Ruby console for that.


1.9.3p392 :001 > r = Redis.new
=> #<Redis client v3.0.4 for redis://127.0.0.1:6379/0>
1.9.3p392 :002 > r.set("key:5", "\xad\xad")
=> "OK"
1.9.3p392 :003 > r.get("key:5")
=> "\xAD\xAD"


This will generate the following output from the MONITOR command:

1999[~]$ redis-cli MONITOR
OK
1369526925.306016 [0 127.0.0.1:55023] "set" "key:5" "\xad\xad"
1369526927.497785 [0 127.0.0.1:55023] "get" "key:5"

In WireShark you’ll be able to see the binary data moving between the client and Redis with the benefit of the command and its parameters clearly visible.
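
What you are looking at is RESP, the Redis wire protocol: every command is sent as an array of bulk strings. Here is a minimal sketch of the framing in Scala (respEncode is just an illustration for ASCII arguments, not part of any Redis client):

// RESP framing: "*<arg count>\r\n", then "$<byte length>\r\n<bytes>\r\n"
// per argument. (Assumes ASCII arguments, where chars == bytes.)
def respEncode(args: String*): String =
  args.map(a => s"$$${a.length}\r\n$a\r\n")
    .mkString(s"*${args.length}\r\n", "", "")

// respEncode("GET", "key:5") produces the bytes WireShark captures:
// *2\r\n$3\r\nGET\r\n$5\r\nkey:5\r\n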


Check out the time between request and response. Redis is fast!


Google and the ecosystem test

I am roaming the halls of Google I/O 2013 and wondering whether Google’s platform passes the ecosystem test.

… no platform has become hugely successful without a corresponding ecosystem of vendors building significant businesses on top of the platform. Typically, the combined revenues of the ecosystem are a multiple of the revenues of the platform.

So much activity but what’s the combined revenue of the businesses building on top of Android, Chrome & Apps?
