SciTech Spread the word: Math is the new sexiness in IT

Discussion in 'Breaking News' started by rajneesh madhok, Dec 18, 2011.

  1. rajneesh madhok

    rajneesh madhok India
    Spread the word: Math is the new sexiness in IT
    By Derrick Harris, Dec. 16, 2011, 3:00pm PT

    “If you go to an eighth grader and ask them how many want to be applied mathematicians, not many hands will go up.”

    So says Dhiraj Rajaram, founder and CEO of Mu Sigma, a Chicago-based startup providing analytics (or “decision sciences” as the company calls it) as a service to a large pool of Fortune 500 customers. He’s probably right, and that’s a problem.

    We need math to improve sales … and manage servers

    Organizations, even large ones, might be masters of the fields in which they do business, but they’re not masters of applied mathematics, which is at the core of the growing data science trend. When it comes time to undertake a big data strategy that requires turning advanced algorithms on potentially massive data sets, many quickly realize they don’t have the necessary skills internally, or don’t have nearly enough of them. Attempts to hire these skills might prove largely fruitless, as the small population of candidates with the requisite acumen in both business and calculus is quickly snatched up by an equally small number of companies.

    Analyzing traditional business data held in a data warehouse is one thing, but doing big data and, more specifically, data science is quite another. McKinsey & Co. predicts that by 2018, the United States will have a shortage of “1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions,” and a shortage of almost 200,000 people with the deep analytical skills necessary for data science.

    But enough about big business data. In an era of webscale computing and large clusters running big-data workloads, we’ll also need more people who can apply mathematics to data with the goal of automating and troubleshooting distributed systems. Sure, predictive analytics can be great for determining how consumers are likely to react to changes to their favorite products, but it can also help ensure that complex systems such as Google’s run smoothly.

    Case in point: Software bugs at Google

    On Wednesday, two members of the Google engineering team wrote a blog post explaining their surprisingly simple new algorithm for detecting particularly troublesome code. The problem, as authors Chris Lewis and Rong Ou present it, is that with such a large, growing and increasingly complex code base — and thousands of developers working on it — it becomes nearly impossible for code reviewers to identify “hot spots.”

    The authors define hot spots as code that “creates issues again and again, as developers try to wrestle with the problem,” as opposed to code that is necessarily difficult because it carries out a complex function. If it’s the former, reviewers need to be alerted to its status up front so they know to give it their utmost attention, or perhaps even hand it off to someone with more experience.

    Based on some research on how to best predict if there are bugs within particular code, the team decided on a simple method for flagging files: “files are flagged if they have attracted a large number of bug-fixing commits, no more and no less.” How the algorithm goes about filtering commits down to only the valid bug fixes is a little more complex, of course. After discussing the results of early experiments with developers, the authors’ team also decided to work in a time variable so that newer bug-fixing commits score higher than old ones that might have already been dealt with.
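The flagging rule quoted above can be illustrated with a minimal Python sketch. It is not the Google team's code: the commit records, the `is_bug_fix` keyword check and the threshold are all hypothetical, and the keyword check in particular is a naive stand-in for the more complex filtering the article mentions.

```python
from collections import Counter

# Hypothetical commit records: (commit message, files touched).
# These are illustrative only and do not come from the article.
COMMITS = [
    ("Fix crash in parser (bug 101)", ["parser.py"]),
    ("Add new export feature", ["export.py", "parser.py"]),
    ("fix off-by-one in parser", ["parser.py", "lexer.py"]),
    ("Refactor lexer", ["lexer.py"]),
]


def is_bug_fix(message):
    """Assumption: a naive keyword check standing in for the real filter."""
    words = message.lower().split()
    return any(word in ("fix", "fixes", "fixed") for word in words)


def count_bug_fixes(commits):
    """Count, per file, how many bug-fixing commits touched it."""
    counts = Counter()
    for message, files in commits:
        if is_bug_fix(message):
            counts.update(files)
    return counts


def flag_hot_spots(commits, threshold):
    """Return files with at least `threshold` bug-fixing commits, sorted by name."""
    counts = count_bug_fixes(commits)
    return sorted(name for name, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    print(flag_hot_spots(COMMITS, threshold=2))
```

With this sample data only `parser.py` reaches a threshold of two bug-fixing commits, so it is the only file flagged.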

    Their algorithm looks like this: [formula image not preserved]

    As explained by the authors, “n is the number of bug-fixing commits, and ti is the timestamp of the bug-fixing commit represented by i.”
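Since the formula is not shown in this text, here is a hedged Python sketch of a time-weighted score of the kind described. It assumes the logistic weighting from the Google post, score = sum over i of 1 / (1 + e^(-12·ti + 12)), with each ti normalized so that 0 is the earliest point in the code base and 1 is now. The constant 12, the normalization and the function names are assumptions, not something stated in the text here.

```python
import math


def normalize(timestamps, start, now):
    """Map raw timestamps onto [0, 1], where 0 is `start` and 1 is `now`.

    Assumption: this normalization is not described in the article text.
    """
    span = now - start
    return [(t - start) / span for t in timestamps]


def hot_spot_score(normalized_times):
    """Sum a logistic weight over the bug-fixing commits of one file.

    Recent commits (t near 1) contribute up to 0.5 each; old commits
    (t near 0) contribute almost nothing.
    """
    return sum(
        1.0 / (1.0 + math.exp(-12.0 * t + 12.0)) for t in normalized_times
    )


if __name__ == "__main__":
    # Hypothetical data: two files with three bug-fixing commits each.
    old_fixes = normalize([5, 10, 15], start=0, now=100)
    recent_fixes = normalize([85, 90, 95], start=0, now=100)
    print("old:", hot_spot_score(old_fixes))
    print("recent:", hot_spot_score(recent_fixes))
```

Under these assumptions, a file whose fixes are all recent scores far higher than a file with the same number of old fixes, which matches the stated goal of letting newer bug-fixing commits count for more.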

    Here’s what it looks like plotted: [plot not preserved]

    It’s relatively simple, but it’s not as if one can always just choose the simplest-possible algorithm and run with it. As author Chris Lewis noted in the Hacker News thread on his and Ou’s post, this algorithm came to be only after much experimentation with far more-complex algorithms to solve the same problem. He had “spent a lot of time trying to implement FixCache, a pretty complicated algorithm that looks at things such as spacial locality of bugs (bugs appear close to each other), temporal locality of bugs (bugs are introduced at the same time) and files that change together (bugs will cross cut these)” before coming across the research that led to the ultimate strategy.

    In another comment, Lewis suggested the future might involve some much more-difficult concepts, such as machine learning. “[W]e don’t have good tools (yet) to have a computer properly check the semantic meaning of our code,” he wrote. “Bug prediction sits as a sort of baby step. It’s the computer making a best-effort guess of where issues will be.”

    We’re just getting started

    Companies such as Google and Facebook are doing all right in solving some of their problems, but what they’re doing now is just the tip of the iceberg. As Lewis indicated, there’s real value in evolving their current efforts further, to the point where machine learning and other techniques will let computers do everything from reviewing code to, perhaps, predicting problems with overall system health. And as the next generation of web companies starts scaling up, they’ll run into their own unique systems issues that they’ll have to solve.

    Data science as it relates to business decisions is an obviously valuable area, and all the talk about big data probably ensures a fair investment in learning those skills. Organizations without internal skills can always outsource data science to companies such as Mu Sigma and Opera Solutions that exist to provide just such services. New, higher-level software products from startups such as Odiago, Platfora and others promise to alleviate some business-oriented analytic pain, as well.

    But applying data science to data about software code or webscale system activity doesn’t always have a direct connection to income, which means it doesn’t get talked about as much. Those skills, however, are arguably as important to our growing Internet economy as big business data is to companies of all types. Hopefully, the message gets out and teenagers start to realize that if they want high-paying jobs with the coolest companies around, they’d better get a lot more interested in math.

    If big data does indeed write the book about the future of business, Mu Sigma’s Rajaram says the climax will be “that mathematicians take the prom queen home.”

    Rajneesh Madhok
