

SF Project List

Page history last edited by Matthew F. Reyes 13 years, 8 months ago



Got an idea for a science hack? Got a brainwave for a mashup? Add it here. If you see an idea you'd like to hack/collaborate on, add your name to it!


What can you do?

Need ideas? Browse some of the ideas from London's Science Hack Day or add to/check out the ongoing list of science-related APIs, data and useful frameworks. Then, add your project idea for San Francisco's Science Hack Day below!




  • Socket2me by Kate Arkless Gray and Tantek Çelik
  • ... 


Ideas for hacks:

physics citations dataset

scientists family tree -- need developers!! 

world bank app

walk through universe

modeling gravitational lenses

integrating genomic + phenotypic data

automated gene function lookup

identify gene function in human genome sequence

near earth objects + arduino

space feed

science education + geoloqi mashup

universal dichotomous key

low cost science instrument robotics hacking

mendeley api + data

butter mind & coconut mind. can fat make you smarter?

sonification/visualization of particle physics data

grassroots mapping in palo alto

identifying the mutations that drive adaptive change in directed evolution experiments

CHDK for discovery-based science exploration

mashups using the PAL frameworks

the flavor of beer

Low Power Cloud Computing on recycled laptops

Wanted: Landing Sites for Team FREDNET's Lunar Mission

Scans to Searchable: citizen enthusiast aided framework

Tricorder Project: Tricorder is a cloud-based application that helps curious people collect and interpret the data needed to answer their questions.

We've got Open Gel Box, open PCR - what's next?

Free the Data: Press Release Images from NASA Planetary Missions

Teaching Genetics through games - **FANCY PIGEONS**

RF modeling the Moon

planet panel interface

A Christmas tree encodes personal DNA

Hack my Astrophoto

Chicken Little in your Smartphone

Make cool visualizations of 1,300 images of Comet Holmes scraped from the web




Science Crowdsourcing Platform


(tags: science, crowdsource, web, app engine)


After some great success with customized crowdsourcing (like Stardust@Home and GalaxyZoo) we thought it was high time to build a generic crowdsourcing platform that would lower the barrier to entry for science groups to ask questions of a dedicated citizen science community. Projects are created, questions formulated, and responses collected. All of this should be done in a visually appealing front-end. The really interesting aspect, we think, is that the site can provide analytics tools for science projects (as well as exposing raw data results to them) and provide connections between otherwise disparate projects ("If you liked 'classifying supernova' you'll love 'looking for Mayan ruins in Google Earth'..."). We need experienced web designers, web framework folks, and graphic designers to get this going!




  • Josh Bloom profjsb@gmail.com




  • ... 



Mendeley API & Data


(tags: science, papers, social)


We are a tool for managing academic papers, and we have built a unique social layer on top of this. We now have a catalog of tens of millions of academic papers, with anonymised readership information about those papers. We are building a recommendation engine and a collaborative filter on top of this data. You can get your hands on the data via our API (http://dev.mendeley.com), and we would love to see some people playing around with this. Perhaps people could weight the APS citation graphs with Mendeley readership data? We have just launched public groups, and some of that data should be available through the API by the time the hack day kicks off. Two of my favourite groups so far are these two: http://www.mendeley.com/groups/536621/creatively-named-research-papers/ and http://www.mendeley.com/groups/568281/physics-nobel-prizes/




  • ...join in!




  • ...



Physics citations dataset announced


(tags: physics, social, publishing)


APS Data Sets for Research (September 16, 2010) - The American Physical Society (APS) is making available to researchers two data sets based on our journals Physical Review Letters, Physical Review, and Reviews of Modern Physics. This corpus comprises over 450,000 articles and dates back to 1893; its size and extent make it attractive for use in research about networks and the social aspects of science. The first data set consists of all pairings of articles in which one article cites another within the collection. The second set contains basic metadata about each article in the collection. Researchers may learn more about the data sets and request access to them by visiting http://publish.aps.org/datasets. Requests will be quickly reviewed and, if approved, the data will be made available for download. Questions may be sent to data-requests@ridge.aps.org.
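As a rough illustration of what the first data set makes possible, the sketch below counts how often each article is cited within the collection. The `(citing, cited)` pair layout and the sample identifiers are assumptions made for illustration; the actual APS file format is not described on this page.

```python
from collections import Counter

def citation_counts(pairs):
    """Count how many times each article is cited within the collection.

    `pairs` is an iterable of (citing_id, cited_id) tuples. This layout is
    an assumption, not the documented APS format.
    """
    return Counter(cited for _citing, cited in pairs)

# Made-up identifiers, purely for illustration.
sample_pairs = [
    ("paper-A", "paper-B"),
    ("paper-C", "paper-B"),
    ("paper-C", "paper-A"),
]
counts = citation_counts(sample_pairs)
print(counts.most_common(1))  # the most-cited article in the sample
```

The same counts could serve as node weights if someone wants to combine the citation graph with other data, such as readership numbers.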





  • ...




  • ...




Hacking DNA with OpenPCR: The Barcode of Life


(tags: bio, openpcr, PCR, dna sequencing, desktop)


Come join our chat in room M83 at 11:45 AM on Saturday!

You: biologist, naturalist, artist, designer, information lover, hardware hacker, biohacker




  • Tito Jankowski
  • Josh Perfetto
  • Patrik D'haeseleer, patrikd at gmail
  • ...join in!




  • ...


World Bank App Contest


(tags: apps, metrics, sustainability)


In October the World Bank will announce an app contest for global development, based around their open data initiative. It would be really cool to experiment with alternative metrics for progress or sustainability: build online tools for evaluating them against different indicators, let people submit their own, etc.




  • Jessy Cowan-Sharp, @jessykate


  • ...and jump in!




  • ...



A walk through universe


 (tags: art, universe, drawing) 


On a long (~100ft), fairly narrow strip of road/concrete/etc., draw every visible object in the universe. There is a nice chart put together on a log scale of every visible object compiled from many different astronomical datasets. The idea would be to spend a while drawing all these objects (obviously it's approximate at some places!) on a scale that would let people walk from the surface of the earth to the edge of the visible universe.
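One possible way to turn the log-scale chart into chalk marks is sketched below: it maps an object's distance to a position along the strip. The function name, the metre units, and the nearest/farthest distances are all assumptions for illustration; the real values would come from the chart.

```python
import math

def strip_position_ft(distance_m, nearest_m, farthest_m, strip_length_ft=100.0):
    """Map a distance to a position along the strip using a log10 scale.

    The nearest distance lands at 0 ft and the farthest at strip_length_ft.
    """
    if not (0 < nearest_m < farthest_m):
        raise ValueError("need 0 < nearest_m < farthest_m")
    if not (nearest_m <= distance_m <= farthest_m):
        raise ValueError("distance_m is outside the charted range")
    span = math.log10(farthest_m) - math.log10(nearest_m)
    return strip_length_ft * (math.log10(distance_m) - math.log10(nearest_m)) / span

# Placeholder range: 1 m above the ground out to a round 1e27 m.
NEAREST_M = 1.0
FARTHEST_M = 1e27

# Roughly the Earth-Moon distance, in metres.
print(round(strip_position_ft(3.84e8, NEAREST_M, FARTHEST_M), 1), "ft")
```

On a log scale every factor of ten takes up the same length of pavement, which is what lets nearby and very distant objects share one walkable strip.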






  • ...and jump in!  








Citizen science: Modelling gravitational lenses


(tags: code, gravitational lens, space) 


Some gravitational lens scientists have worked with Galaxy Zoo to get lens identification into the zoo, but they have another challenge that is well suited to humans and hard for computers: what system of masses gives the observed lensing properties? They have some prototype code that lets people easily build a set of masses and compare it to the observed lens, but it could be built out further, for example with dynamic scoring that gives users more feedback as they model.





  • ...and jump in! 




  • ...



Integrating Genomic Data with Phenotypic Data


build an interface linking genomic variants, genes & conditions with corresponding phenotypic data (100+ longitudinal data points from blood tests and blood pressure, weight, sleep quality, etc. measures) (tags: code, gravitational lens, space)






  • ...and jump in!  






Open-source resource for high-quality automated gene function lookup


a tool to query gene name & one-sentence succinct description for a list of hundreds of genes




  • ...




  • Should include conserved domains within genes as well


  • This project seems poorly defined. Are we talking human genes? Eukaryotic? Bacterial? Do you just want to know how to look up existing annotations, or do you want to tackle genes of unknown function? What level of functional annotation do you expect (and what makes you think you should be able to summarize it in a one-sentence succinct description)? Why a list of hundreds of genes?


  • ...



Passive Wireless Data Display Appliance with Arduino


(tags: green, arduino, wireless)


We all remember fondly the Sharper Image days where you could plop down $49.95 to have some grey-plastic box display sports scores or stock quotes (from constantly broadcast radio transmissions over FM...). Now think about getting a passive summary of anything you want at different times of the day: an appliance that glows green in the morning as your stock portfolio climbs, red during the day as the local energy grid gets taxed, and pulsating blue at night because that's just what makes you feel cozy. Science note: people have been shown to use less energy when they become aware of the local grid stresses. We're envisioning an extensible (and pretty) over-the-air mood appliance that changes colors based on user pre-defined web requests. It's a simple Arduino + wireless add-on + web framework + artsy overcoating.




  • Josh Bloom profjsb@gmail.com
  • Lisa Ballard @BasilLeaf 
  • Christopher Stumm @stumm 




  • ... 



Near Earth Objects + Arduino


Hook an Arduino up to a Near Earth Object feed for it to blink/beep/etc. every time an asteroid flies by.




  • Ariel Waldman, @arielwaldman (I don't think I'm going to be a judge, so count me back in!)


  • Nathan Bergey @natronics 


  • Jade Wang, @qiqing 


  • David Harris, @physicsdavid 
  • Ashish Mahabal, @aschig (mahabal.ashish at gmail.com)


  • ...and jump in!






Space Feed


A feed of awesome space events based on location.


Why the hack?

     Current astronomy sites have complicated designs or are overly technical for the casual observer. Typically lat/long is needed to figure out what is available in the night sky. Phones and browsers can take advantage of location and notification services to make the experience easy and passive for the user.


Visible with:

     - Naked eye

     - Binoculars

     - Telescope


Specify distance:

     - 5 miles, 15 miles, 25 miles, 50 miles, 100 miles etc.


Types of events:

     - Meteor showers

     - ISS flyover (transit, i.e. flying in front of the sun/moon)

     - Other satellites

     - Iridium flares

     - Auroras

     - Planets

     - Constellations

     - ... more?

     - What else is interesting? How do you take a data set and classify an astronomical event "interesting" without human input? Can you?


Visibility considerations: 

     - Weather

     - Light pollution


Nice to Haves:

     - Best viewing spots in your area based on weather/altitude/other factors

          e.g. I am willing to travel X distance to see Y event. Return: "It's cloudy in San Francisco, go to Mt. Tam to see the Perseids meteor shower."

     - Localize measurements based on location or OS language/format.



     - Web based (ideal, open) OR

     - iPhone app (more possibility? visibility/distribution?)


Data sets:

     - ISS tracking data in web format - Created at London Science Hack day (XML file of ISS location): http://randomorbit.net/

     - TLE data for all satellites http://celestrak.com/NORAD/elements/master.asp 
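The "Specify distance" option above could be prototyped with a great-circle distance check, as in the sketch below. The function names, the spot list, and the coordinates (approximate values for San Francisco and Mt. Tam) are assumptions for illustration, not part of any existing feed.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def spots_within(user_lat, user_lon, spots, max_miles):
    """Return the names of viewing spots within max_miles of the user.

    `spots` is a list of (name, lat, lon) tuples (a hypothetical layout).
    """
    return [
        name
        for name, lat, lon in spots
        if haversine_miles(user_lat, user_lon, lat, lon) <= max_miles
    ]

# Approximate coordinates, for illustration only.
SAN_FRANCISCO = (37.7749, -122.4194)
SPOTS = [("Mt. Tam", 37.9235, -122.5965)]

print(spots_within(*SAN_FRANCISCO, SPOTS, max_miles=15))
```

Weather and light pollution would then be further filters applied to whatever this distance check lets through.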





  • Lindsay Eyink, @leyink, fem.bot@mac.com


  • Ben Ward, @benward


  • Jason Wilson, @fekaylius 


  • Ariel Waldman, @arielwaldman


  • ...and jump in!




  • ideas in place for data sources? formats? is this going to be real-time data? i'd be interested in building an audio perfume app on this feed 



Science Education + Geoloqi mashup


Geonotes around science education 




  • Kishore Hari, @sciencequiche


  • Amber Case, @caseorganic


  • Aaron Parecki, @aaronpk


  • Jun Yin, @junnibug 


  • Kevin Rohling, @sundriedcoder 


  • Jade Wang, @qiqing 


  • Jason Wilson @fekaylius 
  • Ashish Mahabal (mahabal.ashish at gmail.com)





  • Thanks, Ariel!
  • Wow, that was really cool. I think Peter from Office Space would be pretty pleased with this technology -- it could just do your work for you as you move around, so you can just, like, zone out. - @peteforsyth 



Universal Dichotomous Key


Dichotomous keys in biology let you identify a species by answering a series of linked yes-no questions. But they usually start out assuming you already have a good idea what it is; for instance, you can get a key for species of ants. But what if you needed to guess a species from scratch? Start with "Is it alive?", then "Is it multicellular?", and so on until you get to a species. Making it api-able makes it useful for future projects like graphing biodiversity based on traits or computerized biology.
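A minimal sketch of how such a key might be represented, assuming a plain binary tree of yes/no questions. The class name, the `identify` helper, and the tiny example key are all hypothetical; a real key would have far more levels and end at species.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class KeyNode:
    """Either a yes/no question with two branches, or a leaf with a result."""
    question: Optional[str] = None
    yes: Optional["KeyNode"] = None
    no: Optional["KeyNode"] = None
    result: Optional[str] = None

def identify(node: KeyNode, answer: Callable[[str], bool]) -> str:
    """Walk the key from the root, asking `answer` each question in turn."""
    while node.result is None:
        node = node.yes if answer(node.question) else node.no
    return node.result

# A toy two-question key built from the questions in the text.
KEY = KeyNode(
    question="Is it alive?",
    no=KeyNode(result="not an organism"),
    yes=KeyNode(
        question="Is it multicellular?",
        yes=KeyNode(result="multicellular organism"),
        no=KeyNode(result="single-celled organism"),
    ),
)

answers = {"Is it alive?": True, "Is it multicellular?": False}
print(identify(KEY, answers.__getitem__))
```

Because each node is plain data, the same tree could be serialized to JSON and served one question at a time, which is one way to read "api-able".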




  • Nathan Bergey @natronics 


  • ...and jump in!




  • This is almost certainly far too ambitious.


  • Nested dichotomous trees?


  • ...



Low Cost Science Instrument Robotics Hacking


We can do some cool stuff with robotics at a low cost these days.  Devices like the arduino, wiimote, and Android phones have some awesome capabilities.  What kind of science can we do with these platforms at a low cost?  Imagine you had a small rover with an arduino, wiimote, and Android phone... which already have many sensors on board.  What kind of science can we do (besides just driving around) with them along with some low cost additions?  If we could do things like spectroscopy or 3D mapping with a platform that was affordable to a focused group of high school students, we would enable a huge number of young students to take their first steps into science and engineering with a very low barrier to entry.  In addition, when capable low cost robots become commonplace, what new avenues are opened?




  • Matt Everingham @matt808


  • Jade Wang @qiqing  


  • Matthew F. Reyes @motorbikematt


  • Guy Pyrzak


  • Patrik D'haeseleer, patrikd at gmail


  • ...and jump in!




  • @matt808 will bring some hardware to hack on.


  • @matt808 is planning to have the beginnings of a spectrometer available which uses a common USB web cam as the sensor.  The main task for this will be writing the software that takes the (I think) jpg output from the webcam and processes it to get a graph.


  • Many digital SLR cameras can be USB-controlled (via http://gphoto.org, eg).  Instead of a moving robot, you could build a telescope mount that can point to any place on the night sky, or track the stars for long exposures.  You can use http://astrometry.net to figure out where you're pointing -- use it for feedback or fine control.  (disclaimer, I run astrometry.net) --Dustin Lang




Butter Mind & Coconut Mind. Can fat make you smarter?


Will eating one of these fats improve your math performance?  Based on Seth Roberts' butter and math study, recently presented at a Bay Area Quantified Self Show & Tell, during which Seth ate half a stick of butter each day and performed better in math, we expect the answer to be yes.


Seth was able to reduce his time by 30 milliseconds.  Will others who try a similar experiment experience the same change? 


In the Butter Mind study, to be run from October 23 - November 12, I will test the hypothesis that butter improves math performance.


This study is meant to mimic Seth Roberts' study, with the addition of a coconut oil group.  Many thanks to Seth for his advice and help getting this started!


Why the addition of coconut oil?  I have a pet theory that the cognitive enhancement Seth experienced may come from the high concentration of medium-chain triglycerides in butter. These are also present in coconut oil, which has been linked to positive effects in people with Alzheimer's disease.  Seth has not tried coconut oil, so he cannot report on its effects on his math scores.


Obviously, no study is perfect, and this one is no exception!  It's a test I was interested in trying myself after seeing Seth's presentation, but I realized it would be far more fun and interesting to include others!  This will be fun for me, and I hope for you, too. At the very least, we will get data from a group over a 21-day period, but we may even get a few curious surprises.


I am currently looking for Butter Mind participants, who will perform a math test daily for 21 days and be in one of the following groups: butter eaters, coconut oil eaters, and controls, who will eat no additional fat but will perform the same math test as the fat-eaters.


To qualify for the study, you must be willing to eat 4 tablespoons of butter or coconut oil (sticking to the same one), or nothing extra, for 7 days and do a 32-problem simple math test for 21 days.  You must have access to the internet to submit your scores.


Study details:


-       Participants will be randomly selected to be in the Butter, Coconut Oil, or Control group


-       The study will take place for 21 days: from Oct 23 - Nov 12


-       The study will be divided into 3 sets of 7 days


  • Part I. Oct 23 - 29: Perform simple math quiz daily  + No additional fats


  • Part II. Oct 30 - Nov 5: "Fat." Perform simple math quiz daily  + Butter OR Coconut Oil. For Controls, just the quiz.


  • Part III. Nov 6 - 12: Perform simple math quiz daily  + No additional fats


-       Non-control participants will ingest 4 Tablespoons of either Butter or Coconut Oil during the "Fat" portion of the study


-       Participants will be asked to share lifestyle information before the study and asked to join an online group to track their data.  Extra sharing (thoughts, epiphanies) is encouraged but optional.
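The random assignment and the three 7-day parts described above could be sketched as follows. The balanced shuffle-then-deal approach, the function names, and the participant names are assumptions for illustration; the study itself only says that participants are randomly selected into groups.

```python
import random

GROUPS = ("Butter", "Coconut Oil", "Control")
PARTS = ("Part I", "Part II (Fat)", "Part III")

def assign_groups(participants, seed=None):
    """Shuffle participants, then deal them round-robin into the three groups.

    Dealing after a shuffle keeps group sizes within one of each other.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {name: GROUPS[i % len(GROUPS)] for i, name in enumerate(shuffled)}

def part_for_day(day):
    """Return which 7-day part a study day (1 to 21) falls in."""
    if not 1 <= day <= 21:
        raise ValueError("the study runs for days 1 to 21")
    return PARTS[(day - 1) // 7]

# Hypothetical participant names.
assignment = assign_groups(["Ana", "Ben", "Cy", "Dee", "Eli", "Fay"], seed=2010)
for name, group in sorted(assignment.items()):
    print(name, "->", group)
print("Day 8 is in", part_for_day(8))
```

Passing a fixed `seed` makes the assignment reproducible, which helps if the grouping ever needs to be audited.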


Additional details:


-       New results will be posted to the QS blog throughout the study


-       You can join for the full 21 days or just for the Science Hack Day fun. Sign up below!


For more information or to join:


Email Eri Gentry: sap.ved.eri@gmail.com




  • Eri Gentry @erigentry


  • ...and jump in! 




  • ...



Identifying the mutations that drive adaptive change in directed evolution experiments


Dataset: we have 8 completely resequenced genomes that need to be parsed, this is Illumina data
- We can force cells to evolve under extreme selective pressure on very rapid timescales (days - weeks).
- With the advent of massively parallel sequencing techniques we can resequence the genomes of evolved cells for a few hundred dollars
- This now allows us to use evolution as a discovery tool. But we need computational tools to cut quickly to hypotheses about which changes to the genes or genome may be responsible for the evolution we observe.
Workflows are needed that integrate multiple disjointed bioinformatics tools and quickly parse gigabytes of sequence information to identify potentially interesting changes.
Examples of important information:
Where do we see chromosomal amplifications? Where exactly are the edges of these amplicons?
Can we identify point mutations, nucleotide changes, in the genome?
Are these point mutations within genes? If so, do they cause an amino acid change? If so, do we have any information about that sequence and its likely function before and after the change?
If the mutations are not within genes, do they affect known transcriptional or translational control motifs? Do we create or destroy a transcription factor binding motif?
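For the question of whether a point mutation within a gene causes an amino acid change, a minimal sketch is given below. It assumes the standard genetic code, a coding sequence given on the forward strand and already in frame, and 0-based positions; the function name and the example sequence are hypothetical and not taken from the dataset.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_point_mutation(cds, position, new_base):
    """Return (old_aa, new_aa, kind) for a single-base change in a coding sequence.

    `cds` must start at the first base of a codon; `position` is 0-based.
    `kind` is "synonymous", "nonsense" (new stop codon) or "missense".
    """
    if not 0 <= position < len(cds):
        raise ValueError("position is outside the sequence")
    start = position - position % 3
    old_codon = cds[start:start + 3].upper()
    if len(old_codon) != 3:
        raise ValueError("position is not inside a complete codon")
    offset = position - start
    new_codon = old_codon[:offset] + new_base.upper() + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        kind = "synonymous"
    elif new_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return old_aa, new_aa, kind

# Made-up sequence: ATG GAA GTT encodes Met-Glu-Val.
print(classify_point_mutation("ATGGAAGTT", 4, "T"))
```

A real pipeline would first need the variant calls and gene coordinates from the resequencing data; this only covers the final "does the protein change?" step.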




  • Liam Holt, ljholt@gmail.com


  • Meredith Carpenter, carpenter@berkeley.edu 


  • Jun Yin, @junnibug 

  • Patrik D'haeseleer, patrikd at gmail


  • ...and jump in! 




  • I assume these are bacterial genomes? Can you tell us which species, so we can come prepared?



Sonification of particle physics data (particle physics wind chime)


(tags: physics, sonification, data visualization)


I am working with particle physics data from the BaBar experiment, which ran at SLAC/Stanford from 1999-2008 (http://www-public.slac.stanford.edu/babar/). I am interested in visualizing the particles' interactions with the detector as well as representing those interactions aurally. The different detectors have attributes which, conceptually, might map onto musical characteristics (pitch, volume, timbre, etc.). I have access to simulated data and it would be sweet to put together a stand-alone program or web interface/API that allows the user to define their own detector/sound maps. In this way, we turn the detector into a wind chime which makes visible/audible these very subtle, but elemental, processes which take place during the experiment.
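One way a detector/sound map might look is sketched below: a single detector attribute is mapped linearly onto a range of MIDI notes, which are then converted to frequencies. The choice of energy as the attribute, the 10 GeV ceiling, and the note range are placeholder assumptions, not properties of the BaBar data.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

def energy_to_midi_note(energy_gev, max_energy_gev=10.0, low_note=48, high_note=84):
    """Map an energy deposit linearly onto a MIDI note range.

    Values outside 0..max_energy_gev are clamped to the ends of the range.
    """
    fraction = min(max(energy_gev / max_energy_gev, 0.0), 1.0)
    return round(low_note + fraction * (high_note - low_note))

# Hypothetical energy deposits in GeV.
for energy in (0.5, 2.0, 7.5):
    note = energy_to_midi_note(energy)
    print(energy, "GeV -> MIDI", note, "->", round(midi_to_hz(note), 1), "Hz")
```

Letting users swap in their own mapping function per detector attribute would be a simple route to the user-defined detector/sound maps described above.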


I am also interested in using the data for education and outreach efforts and this project may help define the challenges in making this type of data available to schools and the general public. While I am not currently involved with any experiments at the Large Hadron Collider, a successful demo of this type of effort may be intriguing to that community. 




  • Matt Bellis,  mbellis@stanford.edu


  • David Harris, @physicsdavid 


  • ...and jump in!




  • ...



Grassroots Mapping in Palo Alto


(tags: art, maps, cameras, communities)
Using super low-cost tools, let's make an open-source map of the Science Hack Day campus. Grassroots mapping was created on the premise that communities should be able to create maps to support land claims made by their residents "at extremely low cost." Using digital cameras, helium balloons, and hacked memory cards, let's create an open source map of Science Hack Day.
For more info, check out: http://grassrootsmapping.org
    •    Stephanie Vacher, @awesome
    •    Eden Sherry, scihackday@eden2.com, iPhone as data collection device


    •    ...maybe you? :]
    •    Is anyone familiar with the various methods for reconstructing 3D scenes from photographs? Here are some links to explore:








CHDK For Discovery-based Science Exploration


The Canon Hackers Dev Kit turns cheap point-and-shoot cameras into programmable tools. Let's extend the platform to include data-logging and environmental sensing. Think of the cameras as cheap, smart, self-contained multi-parameter sensors that kids can use to explore their environments.




  • ... Come on in... 




  • ...



Mashups using the PAL Frameworks


The DARPA PAL program (the Personalized Assistant that Learns) is focused on improving the way that computers support humans through the use of cognitive systems—that is, systems that reason, learn from experience, and accept guidance in order to provide effective, personalized assistance.


DARPA recently released some of the PAL technologies for use by the research community, including documentation of the APIs, examples and user guides, and resolution of third-party license issues. The full information is available at http://pal.sri.com


It would be fun to explore some of these technologies and see what sort of mashups and analysis can be accomplished. I have no current specific ideas...


Some example APIs:


C2RSS helps a user monitor information streams (e.g., RSS feeds) by identifying documents that are of potential interest to that user. The system automatically sorts information by topic and relevance, leveraging user feedback to refine suggestions.


CALO Express (CE) is a lightweight personal desktop assistant that uses learning technology to identify relevant information on the workstation. CE finds and organizes information in email, on a workstation, and on shared drives, to quickly assemble new presentations and to prepare for meetings by gathering relevant information. CE provides a plug-in architecture to easily insert Java or C# code for new learning algorithms, user interfaces, and application logic.


iLink is a social networking component for building a dynamic topic model for each user that is used by the recommendation engine to suggest related people, or information artifacts. iLink’s FAQtory allows a community to ask and answer questions and to build a FAQ repository.


Probabilistic Consistency Engine (PCE) combines possibly conflicting evidence from a collection of data sources into a most probable hypothesis consistent with evidence. PCE takes as input a collection of facts and weighted rules and generates the marginal probabilities of individual atoms and formulas, using mechanisms based on Markov Logic Networks.


Task Learning (TL) enables parameterized tasks to be learned from demonstrated examples, modified by the user, and subsequently executed to perform related tasks. In this way, TL can enable end-user automation of routine or time-consuming tasks, thus freeing user time to focus on more cognitively demanding activities.


The Workflow Activity Recognition and Proactive Assistance (WARP) system provides the capability to recognize user intent by matching observations of user actions to predefined models of workflows. WARP employs Logical Hidden Markov Models (HMMs) to support the recognition and tracking of hierarchical, interleaved workflows in domains that may involve hundreds to thousands of objects.




  • Brett Heliker @bheliker


  • ...and jump in! 




  • ...



The Flavor of Beer (and/or chocolate?)


I would like to create an ontology of flavor preferences and then use it to explore ways to transfer sensory data (taste, smell) between persons. To limit scope, I am focusing on my favorite beverage: beer. Of course, it could be extended to wine, steak, food in general, animals, movies, etc.


Flavor is somewhat qualitative, but as wine tasters will attest, there are ways of quantifying. We shall define a system for classifying and categorizing flavor based on the flavor wheel developed in the 1970s by Morten Meilgaard, adopted as the flavor analysis standard by the European Brewery Convention, the American Society of Brewing Chemists, and the Master Brewers Association of the Americas.


This system should give a solid feel for the flavor profile so that any random person can guess, without tasting, if it is a beer they will like or not. Extended, this identifier could place it in a spectrum of beer flavors that can be used to determine taste, food pairings, preference, etc. It will give you a general idea of the character of the beer.


Graphical representations of this flavor ID could also be explored, similar to the 33Beers flavor wheel but even more complex: At a glance, will I like this beer? Can I transfer to you a few bytes of data that you can instantly recognize as a taste, smell, or preference?


*This should not be a judge of quality, merely a description of the flavor.




  • Brett Heliker @bheliker
  • Patrik D'haeseleer, patrikd at gmail


  • ...and jump in! 




  • There's a nice opportunity to do something similar to this with chocolate as well. The people at TCHO in San Francisco (http://www.tcho.com/) have put a lot of thought and effort into flavor profiles, with a simple 6-component flavor wheel (citrus/fruity/floral/earthy/nutty/chocolatey). They also do a lot of good work empowering their cocoa growers (see http://www.tcho.com/tcho-is/tchosource), setting up local "Flavor Labs" and educating them on how the growing conditions and processing affect the quality of the end product (it takes some fairly specialized equipment to produce the finished chocolate, so most cocoa growers have never tasted the end product of their labor, and therefore wouldn't even know how to produce a better bean in the first place). If we approach them nicely, the people at TCHO might be willing to work with us, either by sharing some of their data (we'd probably have to bring a really solid case for that, and maybe sign something), or more likely, by sharing some of their chocolates. This could be as simple as taste-testing their six flavors of chocolates and correlating that to other food preferences (pepsi vs coke? coffee vs tea? vegetarian or meatarian? candy or chips?...). Or as complicated as studying the effect of weather patterns and fermentation conditions on the flavor profile of the resulting beans.


  • ...



Low Power Cloud Computing using recycled laptops


Getting a cloud set up on a public service like Amazon Web Services' Elastic Compute Cloud (EC2) platform isn't too hard. After all, you just sign up for EC2, provide payment, and, a few clicks later, your cloud is running.


Setting up and running your own private cloud inside your firewall can be just as easy and, even better, completely free to run. Using recycled laptops as servers can provide a low-power cloud computing option. This hack proposes to test whether these recycled laptops can also deliver the performance needed for cloud computing, or even more computing per watt of energy! The tools we will use are all found in Ubuntu Enterprise Cloud (UEC), which is included in the Ubuntu Server 10.10 distribution. We will take 5 recycled Toshiba Tecra M2 laptops (Pentium M 1.7 GHz, 1.5 GB RAM, 60 GB HDD), load Ubuntu Enterprise Cloud on each as a controller or node, then network them.





Wanted: Landing Sites for Team FREDNET's Lunar Mission


Hi all:


Team FREDNET is the open source competitor in the Google Lunar X PRIZE challenge...We were recently awarded a contract by NASA Johnson Space Center to go forth and obtain "innovative lunar demonstrations data" for them! Excited to be a part of Science Hack Day. Here are a couple of challenges for y'all:


1. Help us find a good landing site!


I will give you an introduction to our little lander and rover's capabilities at the event, and will provide info on the engineering constraints of landing on the Moon. Then we shall proceed to dig through some recently released NASA data available online and find our lander a nice place to land...


2. Come up with science experiments for our mission...


Ah! But there is a catch, you see...Your experiment shall be of "zero mass, zero volume and NOT made from unobtainium!" I will provide a description of the vehicle and systems we have got on board, and I challenge you to come up with some neat science that can be performed using what we have. One of my favorite engineering anecdotes is about how Auguste Piccard, the Swiss physicist, ran into the problem of having to run control cables out of a pressurized balloon cabin through pipes (i.e. holes!)...He came up with the clever solution of attaching a u-tube to the hole on the inside of the cabin and pouring in mercury, which sealed the hole and doubled up as a barometer!! That's the spirit!


Come and visit us some time at the Hacker Dojo, Mt. View, CA (we host SuperHappySpaceDev, usually on Saturdays 4-7 PM)




Hackers: 1. Bala Ramamurthy @rocket_bala



Scans to Searchable: citizen enthusiast aided framework


Non-English books out of copyright (and languages not supported by DP)


This is an idea that is similar in concept to what Distributed Proofreaders are doing for Project Gutenberg, but is slightly wider in scope. It can really benefit from getting the work that DP have already done, but there was no response to an email to them.


Here is the idea:


  • start with a scanned book


  • pages are in PDF, tiff etc.


  • served one at a time to enlisted citizen enthusiasts


  • they OCR or type these out


  • returned to the database


  • Useful especially for non-English books


  • OCRed/typed page could be unicode, font, transliterated


  • one can be transformed to another, final stage being searchable unicode


  • Then:


  • separate enthusiasts recheck for errors etc.


  • Plus:


  • dealing with images, quotes, italics, bold, formulae


A higher-level agent keeps tabs on which pages and which books are done. Full books are stored and served from some location.
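The page-by-page workflow above (serve a page, collect the transcription, have a separate enthusiast recheck it, track what is done) could be sketched as a small in-memory tracker. The class names, status labels, and volunteer names are assumptions for illustration; a real framework would sit on a database and serve the scanned images.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    number: int
    status: str = "todo"  # todo -> assigned -> transcribed -> verified
    transcriber: Optional[str] = None
    text: Optional[str] = None

class BookTracker:
    """Keeps tabs on which pages of one book are done."""

    def __init__(self, page_count):
        self.pages = {n: Page(n) for n in range(1, page_count + 1)}

    def next_page(self, volunteer):
        """Hand the first untouched page to a volunteer, or None if none is left."""
        for page in self.pages.values():
            if page.status == "todo":
                page.status, page.transcriber = "assigned", volunteer
                return page.number
        return None

    def submit(self, number, volunteer, text):
        """Store the OCRed or typed text returned by the assigned volunteer."""
        page = self.pages[number]
        if page.status != "assigned" or page.transcriber != volunteer:
            raise ValueError("page is not assigned to this volunteer")
        page.text, page.status = text, "transcribed"

    def verify(self, number, checker):
        """Mark a page as rechecked; the checker must be a separate person."""
        page = self.pages[number]
        if page.status != "transcribed":
            raise ValueError("page has not been transcribed yet")
        if checker == page.transcriber:
            raise ValueError("a separate enthusiast must recheck the page")
        page.status = "verified"

    def is_done(self):
        return all(p.status == "verified" for p in self.pages.values())

book = BookTracker(page_count=2)
n = book.next_page("asha")
book.submit(n, "asha", "typed text of the page")
book.verify(n, "ravi")
print(book.is_done())  # still False: the second page is untouched
```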


The idea came about because there is a 20000-page encyclopedia in Marathi which the Government is ready to release copyright on if one can convert it to Unicode. About 15000 of those pages are scanned PDFs, and there does not exist a reliable Marathi OCR. The remaining 5000 pages are in PageMaker and can be rendered in HTML, with some fixing needed, using a few Firefox add-ons and a bit of scripting.


There are dedicated communities (part of different forums and bulletin boards) who are ready to work on such projects if only a framework is available.


Developing an OCR is another attractive idea, but may not fit in the scope of this hackday.


Project Gutenberg
Distributed Proofreaders
Marathi encyclopedia - official site
Sample converted pages from Volume 18
Project Rastko - seems to cater mainly to Europe

Digital Library of India

Wikisource projects' side-by-side OCR edit pages



* Project Gutenberg -> Internet Archive (archive.org, openlibrary.org, Books In Browsers…) -> Wikisource. The need for (1) automated OCR tools in non-English languages (does Internet Archive serve non-English? need followup) and for (2) good tools and processes for manual conversion of page images to useful text.


Contact: mahabal.ashish at gmail.com for joining or info.

or Pete Forsyth

or Lindsay



 Tricorder Project




Tricorder is a cloud-based application that helps curious people collect and interpret the data needed to answer their questions. 


Tricorder structures observations, helps people choose the right data to record and visualize that data, and provides a community of people to share and discuss results with.


Simple enough to be used in an elementary science class but broad enough to explore both qualitative personal questions and quantitative scientific questions, Tricorder is for citizens, scientists, students, and the generally curious.


Background and Rationale:


Data is the currency of scientific thinking and scientific communication.  The keystone of contemporary scientific literacy is data comprehension.  Data comprehension is the ability to understand the relationships between questions, recorded observations (i.e. data) and conclusions.  For example, data comprehension includes understanding how the limits of observation constrain and inform our questions and conclusions, and asking: What data is most relevant to a question?  What is the best way to present data to support a conclusion?  How reliable are given observations?


The best way to promote data comprehension is to encourage average people to collect and discuss data related to their own questions about the everyday world.  Engaging people in the act of data collection and interpretation compels them to encounter the very questions and limitations faced by scientists studying everything from global warming to kitty litter. Having encountered these questions with data of personal interest, they will be better able to comprehend data of general importance.


The ability to collect, structure, and explore relevant, reliable data has long required large amounts of expensive, bulky equipment.  Ten years ago it would have been unreasonable to expect an average person to carry a notebook, graph paper, a calculator, audio recorder, camera, GPS, accelerometer, photometer, compass, spectrograph, level, oscilloscope, decibel meter and a weather station.  Today, however, all of these devices and more live in your smartphone.


While people today have the unprecedented ability to record high-quality data at any point during their day, they lack a framework to structure these observations.  The goal of Tricorder is to help average people use their capacity to make, interpret and share rigorous observations about a wide array of topics. It guides them in their decisions about what data to collect, reminds them to make regular observations, provides launch points for visualization, and connects them to a community of observers.




  •   Open Source allows the community to add advanced functionality.


  •   Multiple user-level modes and presets for quick use


  •   Decision tree to help select data to collect


  •   Connects to an array of online data sources: meteorological, astronomical, financial etc.


  •   Connects to currently available personal data collection sites like Daytum and Your Flowing Data


  •   Public and private data sets


  •   Geotagging allows users to discover other observations in an area and guides them to relevant local factors


  •   Scheduled reminders for regular observation


  •   Popup questions to collect personal data


  •  Image alignment to help with time-lapse imagery, image normalization and subtractions


  •  Communicates with peripheral devices


  • Works on the web, as a local application, and on phones.


Goals for Science Hack Day:


  • I’m looking for developers, scientists, and others interested in helping bring this idea to life.


  • General Concept exploration


  • Feasibility


  • Hacking some crude functionality using existing services.


Posted by Alan Rorie / Almost Scientific  @almostsci 



We've got Open Gel Box, open PCR - what's next?


(tags: DIYbio, openpcr, hardware hacking, science)


The DIYBio community has already come up with some essential pieces of lab equipment, including a gel electrophoresis box (see http://openwetware.org/wiki/DIYbio:Notebook/Open_Gel_Box_2.0 and  http://www.pearlbiotech.com/), and open PCR (see http://openpcr.org/ and http://www.lava-amp.com/). So what should we tackle next? Which piece of lab equipment is within reach of the DIY builder, and would make the biggest impact on a DIY bio lab?


Here are some suggestions:


- A Shaker platform. Very basic, somewhat essential, but frankly a little boring, IMHO. Biggest challenge might be to design something that can run continuously for a decent amount of time without burning out, creating fatigue fractures, or shaking itself apart.


- A Liquid Handling Robot. A much more ambitious project, but well within reach for a DIY project, considering that there are already open 3D printer XYZ platforms that have a much finer resolution than would be needed for, say, standard 96-well or even 384-well plates, at a price of around $600-$2000 (see MakerBot, RepRap). Some enterprising spirits have even started liquid handling robot designs, built using LEGO Mindstorms:


BioBrick-A-Bot: Lego Robot for Automated BioBrick DNA Assembly


Liquid handling robot


- Flow Cytometry?


- Epifluorescence Microscope? Typically requires a strong wide-spectrum light source, a dichroic mirror, and a bank of filters for each wavelength. Instead, we could use a rotating mirror or a disk with slits to briefly illuminate the specimen, and then image the emitted light less than a microsecond later. Use a series of different colored LEDs (including UV) to illuminate, so you can scan the wavelength of the incoming light. Image using a cheap color microscope. You should be able to image multiple fluorophores at a time, by cycling through the LEDs and correlating the incoming spectrum with the RGB value of the emitted light.


(More ideas to be added...)




  • Patrik D'haeseleer, patrikd at gmail


  • ...and jump in!  



Press Release Images from NASA planetary missions


There are hundreds (thousands?) of press release images from robotic space missions, coupled with compelling copy describing each image. They live on the JPL website and various mirrors at research institutions (see http://photojournal.jpl.nasa.gov/Help/ImageGallery.html or the more scrapable: 


) Non-commercial use is OK as long as a short credit is attached (I think it's really even looser than that: public domain). To my knowledge there is no easy way to use them en masse short of scraping those websites.


These are just the planetary mission images. There are other examples of NASA images that are free to use, tied up inside of web pages.


The idea for a weekend hack would be to scrape the images and content and create a database and/or an API so that developers can more easily access the data. After that, maybe build a few simple demo apps.
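As a rough starting point, the database half might look something like the sketch below, using Python's built-in `sqlite3` module. The table layout and field names (`image_id`, `target`, `mission`, `caption`, `url`, `credit`) are assumptions made for illustration; the real fields would depend on what the scraped pages actually provide. The scraping step itself is not shown.

```python
import sqlite3


def create_db(path=":memory:"):
    """Open (or create) the image database with a single hypothetical table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS images (
               image_id TEXT PRIMARY KEY,
               target   TEXT,
               mission  TEXT,
               caption  TEXT,
               url      TEXT,
               credit   TEXT
           )"""
    )
    return conn


def add_image(conn, record):
    """Insert or update one scraped record, given as a dict keyed by column name."""
    conn.execute(
        "INSERT OR REPLACE INTO images "
        "VALUES (:image_id, :target, :mission, :caption, :url, :credit)",
        record,
    )
    conn.commit()


def images_for_target(conn, target):
    """Return (image_id, caption, url, credit) rows for one planetary body."""
    rows = conn.execute(
        "SELECT image_id, caption, url, credit FROM images "
        "WHERE target = ? ORDER BY image_id",
        (target,),
    )
    return rows.fetchall()
```

Keeping the credit line in every row makes it easy for any demo app to display the attribution next to the image.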


A few ideas:


  • a game, getting to know your solar system, maybe guessing or identifying what planetary body you're looking at


  • different ways to browse the images - a more comfortable web browsing experience, would love to see something for iPad or other mobile devices



  • other ideas? 




  • ...and jump in!  


posted by Lisa Ballard @BasilLeaf



Teaching genetics through games


Let’s design a video game that simulates microevolution, and make it so infectiously fun that players are naturally compelled to understand the underlying genetic mechanisms in order to succeed.
There have been lots of games that use principles of genetics and evolution, demanding varying degrees of understanding from the player - Creatures, SimLife, Rare Breeds, to name just a few. The latter enjoys in-browser approachability, but is limited in scope and depth. It would be great to come up with a system that allows (and encourages) users to come up with ways to peek under the hood and fiddle around. Can we motivate players to design genetics experiments within the environment of a game?




  • Jessica Polka jessica.polka (at) gmail


  • Meredith Carpenter (carpenter@berkeley.edu) 


  • Jun Yin, @junnibug 


  • Patrik D'haeseleer, patrikd@gmail
  • Ashish Mahabal, mahabal.ashish at gmail.com
  • Bala Ramamurthy, @rocket_bala
  • Lil Fritz-Laylin, fritzlaylin (at) gmail


  • ...and jump in!




  • We are working on this in the main room at the table under the sign with a pigeon on it! Come join us...



Fancy Pigeons is a strategy game in which players must selectively breed a flock of pigeons to bypass a series of obstacles.  The object of the game is to get as many pigeons as possible through the course, with points awarded for each offspring that clears a challenge. Because players can see the queue of upcoming obstacles, they can choose breeding pairs which will produce offspring with both short- and long-term fitness. Mendelian genetics is faithfully represented, and in order to succeed, the player must maintain genetic variability in the population through heterozygosity while optimizing for a specific phenotype.

Game Play
Players are given charge of a flock of 24 pigeons. When the game begins, the pigeons mill about on the left side of the screen, pecking at the ground. On the right side of the screen, players see the obstacles: for example, a river, flaming bushes, and a fence. Instructional text appears on the bottom of the screen based on the obstacle at hand, e.g. “Breed pigeons which can cross the river with their heads above water.” By dragging and dropping two pigeons to a “mating box,” players populate a grid of 16 potential offspring, representing every possible genotype which can arise from the parental cross. The phenotype of the offspring is revealed, as well as their genotype. When satisfied, players can press “mate” to release the parents and 16 offspring back into the environment. After a 2-second delay (during which clicking is disabled), 16 randomly selected birds flash and disappear so that the flock size is kept constant. A counter at the top of the screen keeps track of the generations remaining, decrementing each time a cross is made.

When this counter reaches 0, the pigeons attempt to bypass the next obstacle. Clicking is disabled during this time, and on average, 1 or 2 pigeons cross at once, so that players can observe their individual fates: birds that have phenotypes appropriate for the obstacles typically pass through, while those that do not typically die. Points are awarded for each successful passage, optionally modified by the difficulty of the obstacle. After all birds have traversed the obstacle, the generation counter resets, and players are once again free to breed.

Players lose when all birds die, and they win when the pigeons are finished crossing the final obstacle. The winning screen also displays the final point score, which can be posted to a leaderboard.

Deep river. Birds must be tall enough to cross without drowning. Selection: anything other than wild type.
Fire. Birds must be able to walk over the fire without burning their bodies. Selection: birds with long legs.
Fence. Birds must be small enough to fit through a hole. Selection: wild type.

A - dominant, wild type neck.
a - recessive, long neck.
B - dominant, wild type legs.
b - recessive, long legs.
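
The 16-square offspring grid for the two loci above can be sketched in a few lines of Python. This is a minimal illustration of a standard two-locus Punnett square, not the game's actual code; the function names and the four-letter genotype strings such as "AaBb" (neck alleles first, then leg alleles) are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Return the four gametes of a two-locus genotype such as 'AaBb'."""
    neck, legs = genotype[:2], genotype[2:]
    return [n + l for n, l in product(neck, legs)]


def normalize(alleles):
    """Order an allele pair so the dominant (uppercase) allele comes first."""
    return "".join(sorted(alleles))


def cross(parent1, parent2):
    """Return the 16 offspring genotypes of the Punnett grid for two parents."""
    offspring = []
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        neck = normalize(g1[0] + g2[0])
        legs = normalize(g1[1] + g2[1])
        offspring.append(neck + legs)
    return offspring


def phenotype(genotype):
    """Recessive traits show only in homozygous recessive birds (aa or bb)."""
    neck = "long neck" if genotype[:2] == "aa" else "wild type neck"
    legs = "long legs" if genotype[2:] == "bb" else "wild type legs"
    return neck, legs


if __name__ == "__main__":
    grid = cross("AaBb", "AaBb")
    for trait, count in Counter(phenotype(g) for g in grid).items():
        print(count, trait)
```

Crossing two double heterozygotes gives the classic 9:3:3:1 phenotype ratio, which is why keeping heterozygous birds in the flock preserves the option of breeding either recessive trait later.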



 RF Modeling the Moon


(tags: physics, space)


As part of our lunar mission, we'll need to establish communication between our lander and the rover.  In order to validate whatever communication scheme we plan to use, we need not only to understand the RF environment but to have a solid working model of it.  Most lunar missions use remote sensing data, but as our communication link need only span a short distance, the data available for current modeling methods will be too coarse for our needs. As the landing site has not been chosen (see above), we'll be working on the methodology and approach to building this model.




  • Christie Dudley, @longobord ...and jump in below!


  • ...




  • ...



Planet Panel


…is an interface/tool that any citizen can use to understand the state of the world with regard to its resources, not just from a science perspective (climate change, carbon levels, food usage, etc.) but also from a social science perspective (e.g. malaria outbreaks specific to a region, ethnic groups, demographics, population increases in certain geographies, migration of tribes, etc.). The panel's objective is to show the state of the planet, to encourage young people to act responsibly and locally to create positive change.




  • Karen Lau, @k_lau


  • jump in!




  • ...



 Christmas tree encodes personal DNA


(tags: biology, physics, programming, art)


Why do informative DNA sequences have to look boring? An Arduino board, color LEDs, and some imagination could create a fun display, such as a Christmas tree with lights encoding personal DNA sequences. The sequence information can be SNPs (single nucleotide polymorphisms) from a famous person like James Watson or Craig Venter, or from anyone who has done a SNP scan of his/her genome.
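One possible encoding step, sketched in Python on the host computer: each base becomes one LED color, and the colors are flattened into bytes that could then be sent to the board. The color assignments, the function names, and the three-bytes-per-LED frame layout are all arbitrary assumptions for illustration; how the board receives and displays the frame is left open.

```python
# Hypothetical color choice per base, as (red, green, blue) values 0-255.
BASE_COLORS = {
    "A": (0, 255, 0),    # green
    "C": (0, 0, 255),    # blue
    "G": (255, 255, 0),  # yellow
    "T": (255, 0, 0),    # red
}
OFF = (0, 0, 0)  # used for unknown or missing calls such as "N" or "-"


def sequence_to_colors(sequence):
    """Map each base in the sequence to one LED color."""
    return [BASE_COLORS.get(base, OFF) for base in sequence.upper()]


def to_frame(colors):
    """Flatten the colors into bytes, three per LED, in LED order."""
    return bytes(channel for color in colors for channel in color)


if __name__ == "__main__":
    colors = sequence_to_colors("GATTACA")
    print(colors)
    print(len(to_frame(colors)), "bytes for", len(colors), "LEDs")
```

For SNP data, the same mapping could be applied to the base called at each SNP position, one LED per SNP.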




  • Dawei Lin, @iGenomics ...and jump in below!


  • Jun Yin, @junnibug 


  • ...




  • ...



Hack my Astrophoto


(tags: astronomy, programming, image recognition, space)


No, most space enthusiasts don't have a high-power telescope at their disposal, and even if they did, it's usually impossible to find the really cool objects (NEOs, distant galaxies, outer planets) without light pollution or megapixel limitations getting in the way.  But what if you could take a "decent" photo of a starfield from your backyard, run it through an online engine, and superimpose where all of the supercool targets really are?


This could be a great teaching tool for kids if they took a picture of their favorite star and learned the constellation it was a part of by superimposing those lines onto their own picture.


In the case of Near Earth Objects, superimposing predicted locations of NEOs would only be accurate to the best available data - if the object were actually photographed in a different place, the backyard astronomer could submit the photo to the scientists who track NEOs. This could help them make better predictions for Earth-impact events!




  • ... jump in below!


  • ...




  • Chris Gerty (NASA)



  • http://Astrometry.net  provides the "online engine" you describe above.  Works great with wide-field digital-camera snapshots.  Currently we only superimpose constellations, named bright stars, and NGC/IC objects.  Adding moving objects (time from EXIF headers?) from the JPL ephemerides would be cool.



NEO Prediction Engine and Asteroid Hunter (a.k.a. Chicken Little in Your Smartphone)


(tags: astronomy, programming, mobile devices, early warning, spectral analysis, space)


Using location data available on your cell phone, create a warning system for asteroids that would possibly enter the Earth’s atmosphere, a means for providing feedback for the event, and a collection of resources for the amateur asteroid hunter.




  • Lat/Long via Google Latitude, iPhone, etc.




  • Text message describing when an asteroid will enter the visible (from current location) atmosphere

    Expected magnitude of the event

    Where to look


Post-Event Inputs:


  • Images of event

    Video of event

    Lat/Long of photo/video source

    Spectral analysis of the event

    Points of contact to scientific community interested in fresh samples


Post-Event Outputs:


  • Database of photos/videos accessible to asteroid hunters around the globe

    Identification of meteor’s makeup using spectral analysis

    Triangulation of most likely location of meteorites from event




  • ... jump in below!


  • ...




  • Chris Gerty (NASA)



Make cool visualizations of Comet Holmes photos scraped from the web


We searched the web and found 1,300 photos of Comet Holmes, then fed them through http://Astrometry.net to get their locations on the sky.  Making a single plot showing all of them looks pretty cool (http://www.astro.princeton.edu/~dstn/temp/scihackday/holmes-7.jpg) -- imagine the awesome video rendering you could make!


Data at http://www.astro.princeton.edu/~dstn/temp/scihackday/

Some code at http://Astrometry.net/download  (but the plotting code ain't pretty yet)

Advice from code2 *at* astrometry.net (I can't make it to SF but will be checking email)




Scientists Family Tree


We plan to make a scientists family tree -- where scientists are linked to their advisors and who they have advised, thereby making a map of how scientists relate to one another. Three functions: (1) display map; (2) ability for people to add their advisor/advisee links; and (3) search for how two people relate to one another.
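
Function (3), finding how two people relate, amounts to a shortest-path search over the advisor/advisee links. The sketch below uses breadth-first search; the class name `FamilyTree`, its methods, and the placeholder names in the usage example are hypothetical and only illustrate the idea.

```python
from collections import defaultdict, deque


class FamilyTree:
    """Advisor/advisee links stored as an undirected graph for searching."""

    def __init__(self):
        self.links = defaultdict(set)

    def add_link(self, advisor, advisee):
        """Record one advisor/advisee relationship."""
        self.links[advisor].add(advisee)
        self.links[advisee].add(advisor)

    def path(self, start, goal):
        """Return the shortest chain of people linking start to goal, or None."""
        if start not in self.links or goal not in self.links:
            return None
        if start == goal:
            return [start]
        previous = {start: None}
        queue = deque([start])
        while queue:
            person = queue.popleft()
            for neighbor in self.links[person]:
                if neighbor in previous:
                    continue
                previous[neighbor] = person
                if neighbor == goal:
                    # Walk back from the goal to the start, then reverse.
                    chain = [goal]
                    while previous[chain[-1]] is not None:
                        chain.append(previous[chain[-1]])
                    return chain[::-1]
                queue.append(neighbor)
        return None


if __name__ == "__main__":
    tree = FamilyTree()
    tree.add_link("Advisor A", "Student B")
    tree.add_link("Advisor A", "Student C")
    tree.add_link("Student C", "Student D")
    print(tree.path("Student B", "Student D"))
```

Storing the links undirected keeps the search simple; the display map would also need to remember which side of each link is the advisor.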


Data: scraped from Wikipedia to start with

Who: Lil, Richard, @wsm1





Comments (3)

Liam Holt said

at 4:27 pm on Oct 4, 2010

Identifying the mutations that drive adaptive change in directed evolution experiments

- We can force cells to evolve under extreme selective pressure on very rapid timescales (days - weeks).
- With the advent of massively parallel sequencing techniques we can resequence the genomes of evolved cells for a few hundred dollars
- This now allows us to use evolution as a discovery tool. But we need computational tools to cut quickly to hypotheses about which changes to the genes or genome may be responsible for the evolution we observe.

Project flows are required that integrate multiple disjointed bioinformatics tools and quickly parse gigabytes of sequence information to identify potentially interesting changes.

Examples of important information:

Where do we see chromosomal amplifications? Where exactly are the edges of these amplicons?
Can we identify point mutations, nucleotide changes, in the genome?
Are these point mutations within genes? If so, do they cause an amino acid change? If so, do we have any information about that sequence and its likely function before and after the change?
If the mutations are not within genes, do they affect known transcriptional, or translational control motifs? Do we create or destroy a transcription factor binding motif?

- Liam Holt, ljholt@gmail.com

Devin Lee Drew said

at 6:26 pm on Nov 13, 2010

Potential side project:

Anyone know some R? I know a bit, and I could use another brain to collaboratively implement a new multiple test p-value correction. R, Rserve, API docs and example R code on hand for the tweaking; the goal is integration via the Genedata Analyst API for (more easily visualizing) R script results.


Devin dLd-->pobox.com black t-shirt, glasses, tags biophysics, stats, bikes

thomas savarino said

at 12:20 pm on Oct 30, 2011

I just saw this while I was poking around. did you ever get the help you needed? if not, I'd be willing to help out
