It was lunchtime and wonton Tuesday at the cafeteria, but our data scientist was not there. He was far away, dirt smeared over his face, hiding behind a stack of red construction bricks and trying to blend in. It was a hot day, the looking-for-shade-under-a-magnifying-glass kind of hot. His mind was racing, and he was working hard at staying composed and strategic; a rash decision at this point would end it all. He needed the data. It was there, right in front of him, practically within arm’s reach, yet a chasm made of trigger-happy militiamen, drugged-up illicit traffickers, exploding fires, and a lazy co-worker made it all feel so grinding and discouraging…
Yes, a lazy co-worker, forget the other stuff. How many times did I need some data for a project or study only to be turned down by a gatekeeper explaining that an SQL join of that size would bring the system to its knees, or the multi-departmental permission form was still missing 2 out of 5 signatures? And no, I wasn’t requesting this on a Friday at 3:30 PM or during lunch on wonton Tuesday. But I’m not here to complain, no, I’m here to celebrate the ingenuity of colleagues not letting a lack of common curiosity and common urgency stand between them and their data.
Never Mind, I’ll Collect It Myself — Guerrilla Data Scientist #1
A colleague was attempting to measure how staff late cancellations and no-shows affected the outcome of a hospital department. At the time, the data he needed was not recorded in any centralized location; in fact, it wasn’t recorded digitally at all. He had a choice: shrug it off and share the merits of collecting this data during the next company meeting, or roll up his sleeves and go hunting!
This was no easy feat. He built a VB.NET application to simplify data collection and shared it with the nurses in charge. He not only had to convince them to add this extra step to their hectic workflow, somewhere between performing life-saving medical procedures and giving a crap about what may have seemed like a nerdy pet project to most, but also had to impress on them the importance of recording the data systematically and consistently if it was to be of any use.
This blows me away. Besides the expected skills, this calls for vision, confidence, a true connection with end users, an aptitude for enlightening others on the importance of analytics in the era of ‘A.I., the human job pillager’, and the patience to painstakingly choreograph it all through to completion.
BYOData — Guerrilla Data Scientist #2
When you assign a cool project to an ambitious, get-it-done data scientist, the type used to quick turnarounds and pushing boundaries, it’s going to get done one way or another. And when that same project just can’t get started due to data issues from a time-strapped customer with limited resources, most would just complain about it and enjoy the lull between projects. Not this data scientist.
This was an image-modeling project around recognizing commercial products. On his way home, he spotted what looked like images similar to those described in the project specs. He pulled over by the side of the road, iPhone in hand, snapped away at those images, and went through the laborious process of labeling the data later that evening. This gave him all the images he needed to get the model built, trained, and tuned, and to blow his customers’ socks off.
This transcends your typical ‘project requirements’. The unstoppable need to explore possibilities and go places where few have been before is an undeniable trait of a successful data scientist.
See You on YouTube — Guerrilla Data Scientist #3
This has to be one of my strangest professional experiences and a big career lesson: if the idea is perfect but you can’t find takers internally, give it to YouTube.
If you have a high-value project with clear internal demand but just can’t find takers within your company, it doesn’t mean you or your project are to blame. There are hundreds of reasons why great projects never see the light of day. The kitchen could be filled with too many decision makers; your leadership could be ill-defined, powerless, or in transition; your company could be hit by budget restrictions or operating under a culture of contracting out ambitious work instead of building internally.
In this case, my teammates and I knew the idea was dynamite and had the feedback from conferences where we presented the work to prove it. I found very similar, real-world public-domain data close to the real internal stuff. This allowed us to build the predictive model, work out the kinks, and even prototype a distribution mechanism (it ain’t real until it reaches your customer’s plate).
The project wasn’t getting any traction internally, but I presented it at a few places and a friend from Bellevue Community College wanted me to make a video of our team’s work for her students. I complied, presented it and, honestly, kind of moved on.
A few days later, a physician and champion in the field got in touch with me. Coincidentally, this was somebody I had attempted to present the project to before producing the video. He raved about finding me on YouTube and how our goals were uncannily aligned. I have no hard feelings here; I’m just grateful for the incredible lesson. He pulled the project out of the doldrums (and put our department on the map), and we rolled out the model to hundreds of happy users.
And here is a link to that video — it’s an oldie, but it helped my career and our little data science department, and it still brings tears to my eyes. 🙂