### Picking pieces of a jigsaw puzzle

October 22, 2007

Here’s a pretty cool brainteaser suggested by Pierre Dangauthier in a comment on a previous post of mine.

In a bag, there are all the pieces of a complete square jigsaw puzzle of size $N \times N$. You are allowed to pick just $Q$ pieces, one at a time, without replacement. After having seen the $Q$ pieces, what is your estimate of $N$? With what confidence?

I don’t know the name of this problem. If you do, please let me know.
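
One possible way to attack the puzzle, offered as a rough sketch rather than the intended solution: assume every piece can be recognized as a corner, an edge or an interior piece. An $N \times N$ puzzle then has 4 corners, $4(N-2)$ edge pieces and $(N-2)^2$ interior pieces, so the counts seen among the $Q$ drawn pieces follow a multivariate hypergeometric distribution. With a flat prior on $N$ (also an assumption), the posterior over $N$ can be computed directly. The function name `posterior_over_n` and the cut-off `n_max` are made up for this illustration.

```python
from math import comb


def posterior_over_n(corners, edges, interior, n_max=200):
    """Posterior over the puzzle side N, given counts of drawn piece types.

    Assumptions (not part of the original puzzle statement):
    - each drawn piece can be classified as corner, edge or interior;
    - the prior on N is uniform over 2..n_max.
    """
    q = corners + edges + interior
    weights = {}
    for n in range(2, n_max + 1):
        total = n * n
        if total < q:
            continue  # cannot draw q pieces from a smaller puzzle
        # Multivariate hypergeometric likelihood of the observed counts.
        ways = (
            comb(4, corners)
            * comb(4 * (n - 2), edges)
            * comb((n - 2) ** 2, interior)
        )
        weights[n] = ways / comb(total, q)
    norm = sum(weights.values())
    if norm == 0:
        raise ValueError("observed counts are impossible for every N considered")
    return {n: w / norm for n, w in weights.items()}


if __name__ == "__main__":
    # Example draw: 0 corners, 4 edge pieces, 6 interior pieces (Q = 10).
    post = posterior_over_n(corners=0, edges=4, interior=6)
    best = max(post, key=post.get)
    print("most probable N:", best, "with posterior probability", post[best])
```

The posterior itself answers the "with what confidence" part: summing its values over a range of $N$ gives the probability, under these assumptions, that the true size lies in that range.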

### Picking marbles

October 14, 2007

Imagine that you are asked to pick marbles from a bag that contains 18 marbles, of which 8 are red, 4 are green and 6 are black. You are blindfolded and asked to draw one marble at a time (you are not allowed to put a marble back into the bag).

What is the minimum number of marbles that you would have to draw to be certain of having at least 3 marbles of the same color?

### How to organize scientific papers?

October 2, 2007

I have hundreds of scientific papers stored on my hard disk drive. I try to organize them by author, by field, or by research group. However, many scientific papers

• have more than one author
• are written by more than one research group
• encompass more than one research field.

For example, let us imagine that I have this neat, fictional paper on quantum computing. What if I archive it by:

• research group? Problem: the paper was written by two research groups: one at Stanford and another one at Caltech.
• author? Problem: the paper was written by three “heavy-weight” researchers. Which one to choose?
• research field? Problem: the paper is about quantum computing, which is a multi-disciplinary area. I could archive it in the Quantum Physics folder, or in the Computer Science folder. I could, of course, store it in the Quantum Computing folder, but that may be a bit too specific. What if I have a paper on quantum error-control that is closely related to this quantum computing paper? Shouldn’t these papers be archived in the same folder?

Many questions, few answers. How should I organize my papers so that I can find them in an efficient manner whenever I need them?

Some months ago, I came across an interesting discussion on Nuclear Phynance about this topic. Some people on NP were using iTunes to organize their papers (no kidding! check this out) or some general-purpose document management systems. Nevertheless, these applications were not specifically tailored to manage scientific papers. No general-purpose application can perform well under all possible scenarios. I thus would love to have an application specifically designed to manage scientific papers.

What features should that application have? Well, it should work a bit like iTunes: in an easy, intuitive and efficient manner. One could keep all papers in one folder or, if you prefer, in several folders (for example, one folder for all 2007 papers, another for all 2006 papers, and so on). The application should then allow one to add papers to the library, just as iTunes does, and to specify several different fields for each paper, such as:

• type: is it a conference paper, a journal paper, a technical report?
• event: if it is a conference paper, then one should be able to specify which conference it was.
• title: this one is pretty self-explanatory.
• authors: one should be able to input a list of all the authors.
• keywords: instead of categorizing a given paper in a rigid manner, a fuzzy approach would be better. A given paper does not need to be either a math paper or a physics paper; it can be both!
• abstract: this could be a logical extension to the “keywords” field.
• bibliography: this might seem irrelevant, but imagine that you could specify which other papers a given paper refers to! Even better: imagine that the application would immediately scan your library to try to find those papers mentioned in the bibliography!
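
To make the list above a bit more concrete, here is a minimal sketch of what one record in such a library might look like. Everything in it is hypothetical: the `Paper` class, the helper names `with_keyword` and `cited_in_library`, and the two example papers are made up for illustration, and bibliography entries are matched by title only, which a real tool would have to do far more robustly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Paper:
    """One library entry, with the fields listed above (hypothetical layout)."""
    title: str
    authors: List[str]
    type: str = "journal paper"   # or "conference paper", "technical report"
    event: Optional[str] = None   # conference name, if any
    keywords: Set[str] = field(default_factory=set)
    abstract: str = ""
    bibliography: List[str] = field(default_factory=list)  # titles of cited papers


def with_keyword(library, keyword):
    """All papers tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [p for p in library if wanted in {k.lower() for k in p.keywords}]


def cited_in_library(paper, library):
    """Papers from the bibliography that are already in the library."""
    by_title = {p.title.lower(): p for p in library}
    return [by_title[t.lower()] for t in paper.bibliography if t.lower() in by_title]


# Two fictional papers, in the spirit of the example above.
qec = Paper(
    title="A Fictional Paper on Quantum Error Control",
    authors=["A. Author"],
    keywords={"quantum physics", "coding theory"},
)
qc = Paper(
    title="A Fictional Paper on Quantum Computing",
    authors=["A. Author", "B. Author", "C. Author"],
    type="conference paper",
    event="Some Conference 2007",
    keywords={"quantum physics", "computer science"},
    bibliography=[qec.title, "A Paper That Is Not In The Library"],
)
library = [qec, qc]

print([p.title for p in with_keyword(library, "Quantum Physics")])
print([p.title for p in cited_in_library(qc, library)])
```

Because keywords form a set rather than a single folder, the quantum computing paper shows up under both "quantum physics" and "computer science", which is exactly the fuzziness that a folder hierarchy cannot provide.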

The problem is: I have many hundreds (if not a few thousand) of papers. Specifying the aforementioned fields for each and every paper on my hard disk drive would be prohibitively time-consuming. However, what if a new file format were created? That new format could encapsulate the file (in PDF or PS format, for instance) and all the metadata! To make things simple:

1. the paper’s authors would input all the metadata at once, and then publish the files on their webpages.