Wednesday, May 3, 2017

More Tips from Charles Leiserson to Write Scientific Papers

I learned more from Charles Leiserson about writing scientific papers and giving presentations. The focus here is on diagrams, figures, code, and other artifacts.
Here is a summary of my observations from today's workshop:

  • Shade the foreground in diagrams with a light color. This directs attention to the right place. Use shadows in diagrams only rarely (shadows work well on PowerPoint slides). Such polish affects the "emotional impact" on reviewers (as do typos, an unpolished look, etc.).
  • Avoid clutter on the axes (e.g., instead of repeating "×10^6" on every tick, label the axis "in millions").
  • Place data descriptions close to the data (e.g., replace legends with labels next to the plotted functions).
  • Try to transform plotted functions so that they are less compressed and more readable (e.g., by normalizing them).
  • It is good to have a summary plot in the introduction and add breakdowns (with more details) later.
  • Use full sentences in figure captions (it is good to start with a phrase and continue with full sentences). The phrase is semantic, and the following sentences describe syntactic aspects of the figure. In the body of the text, you do not talk about syntactic aspects. Similarly, arguments that belong to the text should not be in captions. Don't be repetitive.
    One other way is to say: "X is better than Y" (semantic part, the takeaway, the message, ...), followed by syntax description. 
  • Colors: they should enhance the figure; yet, the figure should be understandable for people who are color-blind or have a printed copy in black and white.
  • Be careful when presenting a diagram as a continuous curve; it might raise questions when your data points are discrete (in that case, make sure the data points are highlighted with dots in the diagram). Also, consider using bar charts when applicable (sometimes continuity should be highlighted by using curves instead of bars).
  • You should always have one exhibit in the introduction as an executive summary (a diagram, a figure, a table, etc.). Worked examples are also good in the introduction for motivation.
  • Tables: generally, avoid vertical and horizontal lines (unless needed, e.g., under the first row). The first column is left-justified and the others should be center-justified. Consider using the siunitx LaTeX package. Headers should be italic and the body in roman. Make sure that the semantics are included at the beginning.
  • Captions should include a small semantic sentence/phrase, followed by syntactic sentences. It should not include more semantic descriptions/conclusions. They go to the text. Syntactic sentences should be precise. 
  • No colons in the first column of a table. In slowdown columns, don't use 'x'. Remove headers such as 'test name'. Transpose tables so that the numbers you compare are vertically aligned: entries in the same column (as opposed to the same row) should be comparable. Include more data if necessary (e.g., include both running time and slowdown). It is better to include the quotient of two rows where it helps (e.g., slowdown).
  • It is good to make the caption smaller than the body text. The size of the caption should be uniform with the text in figures. There are exceptions, e.g., axis labels. This requires avoiding shrinking of figures (which makes figure labels smaller).
  • In your figures, use alignment (e.g., in PowerPoint) so that figures are symmetric whenever required (e.g., boxes should have the same size and their edges should be aligned in a figure). 
  • Be careful when using examples, e.g., when drawing a triangle, if you draw it with equal sides, the reader easily assumes that all triangles in the text are equilateral (while you might mean a general triangle).
  • For graphs, consider placing arrows in the middle of the edges instead of at the endpoints (in order to avoid congestion at vertices). This is done in "Combinatorial Mathematics".
  • Always use a face in a diagram if possible (e.g., a student in a diagram can be represented by a face or body). This will be more compelling for the reader. This point definitely applies to presentations and, to some extent, to papers.
  • It is not bad to be entertaining when writing your paper. It is OK, as long as it is not distracting. 
  • Use the Inconsolata font for code and other situations that require a fixed-width font in your paper. It is also good to put code in a highlighted box (with a little triangle on the bottom-right, which gives a feeling that the code is on a sheet). Also, colorize the code (see this paper, for example). Avoid using a black font in the code; replace it with a very dark brown, and apply the same when using words from the code in the text (e.g., "we do X to implement 'for' loops").
  • In papers, consider numbering code-lines globally. 
  • Avoid possessive terms such as "our software", "our result", etc. They set up a dynamic with the reader that is not very nice. In general, do not depend too much on "we" and "our".
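
To make the two plotting tips about axis clutter and normalization concrete, here is a minimal Python sketch. It is only my illustration, not something shown in the workshop; the helper names `scale_for_axis` and `normalize` are made up for this example, and the tick values are invented.

```python
def scale_for_axis(values, unit=1_000_000, unit_name="millions"):
    """Return uncluttered tick labels plus a single axis annotation.

    Instead of printing a "x10^6" factor on every tick, divide once
    and state the unit in the axis title.
    """
    labels = [f"{v / unit:g}" for v in values]
    return labels, f"in {unit_name}"


def normalize(series):
    """Scale a series so that its largest value becomes 1.0."""
    peak = max(series)
    return [v / peak for v in series]


# Hypothetical tick positions for an axis that counts items.
ticks = [0, 2_000_000, 4_000_000, 6_000_000]
labels, axis_title = scale_for_axis(ticks)
print(labels, axis_title)

# Hypothetical running times, normalized so curves share one scale.
print(normalize([50, 100, 200]))
```

The labels and the axis title can then be passed to whatever plotting tool you use.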


Wednesday, April 26, 2017

Charles Leiserson Tips for Technical Writing

Here are some useful tips about technical writing that I learned from Charles in his advanced performance-engineering class:

  • Avoid the word "New" in the title (it will not be new 20 years from now). The same problem applies to "novel" (less serious).
  • Avoid needless words in the title ("building a").
  • Avoid weak words such as "is", "to be", and "have", e.g., replace "they are unable to" with "they largely fail".
  • Always embed "however" in the middle of the sentence. It is not the same for "But". In that case, use short sentences (break "xxx, but" into "xxx. But").
  • The verb is the most important part of the sentence. After that, the subject is the second most important. So, never restrict a subject to a bare "This". Add a noun after "This". Also, avoid "it", e.g., avoid "it is clear that".
  • Related to the previous point, replace instances of "is focused on" with "focuses on" (a more powerful verb).
  • Avoid semicolons as much as possible.
  • It is OK to have passive verbs, but only when required.
  • If you are using "while" as a connector, replace it with "whereas".
  • Replace instances of "very" with "damn" and then remove "damn" (inspired by Mark Twain's diary). For example, replace "very good" with "excellent".
  • Be careful about hyphenation. "cloud-computing services" is different from "cloud computing-services". There is no need to hyphenate "quickly growing function" because "quickly" is clearly an adverb (do not hyphenate adverbs ending in "ly").
  • Use "Performance-Engineering" instead of "Performance Engineering" because it is the subject. In this example, "Performance-engineering" and "Performance-Engineering" are both fine. In similar situations, always capitalize "Is" because it is a verb.
  • Replace "There are a number of" with "There are several" or "There are many". Even then, "there are" has the weak-verb problem mentioned earlier.
  • Use present tense. Avoid future tense and, even worse, "would" "could", etc. It is fine to use past tenses depending on the context.
  • Set the context first in long sentences: "When attempting XX, teachers face YY" is better than "Teachers face YY when attempting XX".
  • It is OK to highlight a term in both the abstract and the body.
  • "IDs" vs "ID's". Plural acronyms do not need an apostrophe in modern English. Charles is OK with the old style, however.
  • "a, b, and c" is preferred to "a,b and c".
  • This is about using digits vs spelling out numbers. Knuth says that if the number is less than 10 (or 12), you spell it out; otherwise, use digits. However, it is a bad rule for technical writing. Let's do the following: if you do math on the numbers, use digits. Otherwise, follow the other rule.
  • Be cautious in using "we", since it can mean all authors or authors plus readers. The distinction should be clear.
  • The first sentence of the paper should be about contributions rather than background description.
Here is an example:

Performance-Engineering of CilkSan
Compressed Dictionaries for Optimizing the Shadow Memory

This paper explores new solutions for maintaining the shadow memory of CilkSan, a debugging tool that detects determinacy races in programs written in Cilk. At the heart of CilkSan lies the SP-bags algorithm, which is a provably good, efficient race-detection algorithm. The SP-bags algorithm maintains a shadow memory data structure that stores the IDs of the previous reader/writer procedures for each memory location. Unfortunately, the large size of the shadow memory is a bottleneck for detecting races for programs with high memory demand.

We aim to decrease the size of the shadow memory by applying compression techniques. We introduce the concept of Compressed Dictionary as an abstraction of a shadow memory with high spatial locality. This abstract data type is expected to have applications beyond CilkSan. We employ different techniques such as run-length encoding, Lempel-Ziv, and Burrows-Wheeler encoding to implement a compressed data structure. Using these solutions, one expects the size of the shadow memory to be considerably smaller than the memory requirement of the current serial program. As a result, we can determine races in real-world applications in which the current version of CilkSan fails due to excessive memory usage.
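
To give a feeling for why spatial locality helps here, the following is a minimal Python sketch of run-length encoding applied to a sequence of procedure IDs. It is only an illustration of the general technique, not CilkSan's implementation; the function names `rle_encode` and `rle_lookup` and the sample IDs are made up for this example.

```python
from itertools import groupby


def rle_encode(ids):
    """Collapse runs of equal IDs into (id, run_length) pairs."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(ids)]


def rle_lookup(runs, index):
    """Return the ID stored at position `index` of the original sequence."""
    if index < 0:
        raise IndexError(index)
    for value, length in runs:
        if index < length:
            return value
        index -= length  # skip this run and keep searching
    raise IndexError("index past the end of the encoded sequence")


# Hypothetical shadow memory: neighboring locations often share
# the same last-accessor ID, so runs are long.
shadow = [7, 7, 7, 7, 3, 3, 9, 9, 9]
runs = rle_encode(shadow)
print(runs)                 # three runs instead of nine entries
print(rle_lookup(runs, 5))  # ID stored at location 5
```

The linear scan in the lookup keeps the sketch short; a real structure would need faster access and support for updates.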



Saturday, April 22, 2017

Postdoc Improv Class

I attended an Improv class arranged for MIT postdocs. It was different from (and much better than) the class that I described here. It was a two-hour class in which three instructors from Improv-Boston introduced around 20 postdocs to the concept.
We started with a countdown exercise, where we counted from 8 to 1, then 7 to 1, etc. For each count, you had to move one of your hands or legs. This is to help with breathing and getting comfortable with the situation.

The first task was to 'imagine an extra-human power for yourself and pick a name for yourself based on that, and then act that name'. For example, my extra-human power was to 'be able to impeach any president', my name was 'Shahin-the-impeacher', and the action was just miming throwing something off.

The first real improv exercise was 'list-5-things', where you asked a random participant a random question, and they had to answer by listing 5 items. For example, I was asked about 'sports that I like to play' and, after answering with a list of 5 (after each item, the others cheered), I asked another postdoc to 'list 5 animals that you want to be reincarnated as'. There is an element of surprise, and you have to answer without knowing the question in advance.

Another exercise was about 'gift-presenting'. Two participants took part, and the first one pretended to give the other a gift. Based on the way he/she acted, the second one had to improvise what the gift was and thank him/her for it. For example, I can pretend to give you a big, heavy bag, and you can thank me for giving you a bag of potatoes. Again, there should be an improvisation on what the gift is, and on how to handle the question that follows it: 'how did you know that I need a bag of potatoes?'
 
For the next exercise, we were partitioned into groups of around 10 people. Each group had to advertise a product. The instructors threw random questions at each member. For example, our product was inspired by the problem of 'people not cleaning up after their pet'. The first question was 'what is the name of your product?', followed by 'how does it work?', 'where is your market?', 'what is your motto?', and so on. I was asked 'which celebrity uses your product?' and my answer was 'Peter Griffin'. Finally, we had to advertise our product with an improv act. One of our group members played the role of a dog, one was the dog-walker, and I was the narrator.

The next task was a random improv dialogue. I found this one the most interesting. Two people started a conversation with short sentences. The conversation continued with each person repeating the sentence of the other person and adding 'And ...', followed by their own sentence. For example, you could say 'I love Boston' and I could answer 'Yes, you love Boston, and I think the weather could be better in Boston'. Here, I am showing a sort of disagreement, but I do not use 'But'. This seems to be critical in continuing an improv dialogue.

In the final task, we were given a page where we wrote a problem on one side and the name of a random object on the other side. In pairs, we had to use improv to solve the other person's problem with our own object. For example, I had to provide a solution for 'not having enough time' with 'a pair of glasses'. It was quite challenging, and I think I was not creative enough in providing a solution.

I found the class very interesting and helpful. I am considering registering for the next class, which starts in a few weeks and ends just before my departure from Boston.

Wednesday, April 19, 2017

Sliceform Studio

I am sitting in on Erik Demaine's class on Geometric Folding. This course is not the typical kind of Computer Science course that I enjoy. However, occasionally, it involves cool, interesting topics and tools.
Last week, I learned about Sliceform Studio, developed by former MIT student Yongquan ‘YQ’ Lu under the guidance of Erik.

The idea is to use geometry to understand/reproduce the beautiful designs often found in Islamic (particularly Iranian) tiles and fabrics. I enjoyed playing with this tool a lot. The code is open source and one can play with it.
I discussed with Erik the idea of making a font with this tool. I believe one can make a beautiful font with an oriental look with Sliceform Studio. More details about the tool can be found on its website.
An example of a design that it generates follows.




Thursday, April 6, 2017

MIT EECS Postdoc Visiting Committee / postdoc issues

Since last Fall, I have been a member of the postdoc visiting committee, which aims to gather feedback about postdocs' lives and their concerns/issues at MIT.
Over the course of a few months, we collected information/surveys from postdocs. Rabia Yazicigil managed our efforts and presented the results last Tuesday.
We reported to a group of MIT faculty/alumni, who seemed very willing to improve postdocs' lives. I think the postdoc management and leadership workshop that I attended in January 2016 was inspired by the last visiting committee report.
The main concerns that postdocs at MIT have can be summarized as follows (from my point of view):

  • A sense of "belonging" to MIT. Postdocs spend a relatively short period of time at the institute, and at a relatively older age. It is harder for them to "connect" to an environment which already "belongs" to students and other staff. Postdocs are mostly occupied with doing research and planning for the future and, as a result of this pressure, have little time to participate in activities which help them build a community and adapt to the new environment. Note that this issue is inherent in the postdoc structure. MIT has done a good job of addressing this problem. But it still seems to bother many postdocs.
  • MIT distinguishes between "postdoc-fellow" and "postdoc-associate". I am both because I receive money from both MIT and outside MIT (NSERC). But if you want to "partition" postdocs into the two categories, I will be a fellow because most of the money that I receive comes from outside. Now, there are distinctions in benefits for these groups. In particular, I and other fellows do not receive any health benefits. A postdoc fellow pays around $300-400 per month for health-related insurance, which I found astonishing. Health insurance has been the worst memory that I have from MIT (and arguably this country).
  • The Postdoc Leadership Workshop has been great in helping postdocs to form a community in which they enhance their leadership skills. It is held in Endicott House, which is located outside Boston. This workshop takes two days, and many postdocs cannot attend it because they have a family. This reveals one of the main differences between a typical postdoc and a typical student. I think MIT is planning to hold similar workshops (but one-day workshops) in the near future.
I have found the idea of Visiting Committees very exciting for understanding and solving the issues that students, postdocs, and other staff experience in an academic environment.  You can learn about MIT visiting committees here.


Saturday, April 1, 2017

My Supervisor, Alex ...

On March 12th, just after a Sunday hike in Blue Hills Reservation on the outskirts of Boston, I received three emails from Waterloo. Ian Munro, senior professor in the Algorithms group in Waterloo, as well as Khuzaima Daudjee, in the Database group, and Daniela, a friend, informed me that my PhD supervisor, Alejandro Lopez Ortiz, had died earlier that day. I had last visited Alex in February, when I found him very thin and tired. He had lost almost half of his body weight, and looked very exhausted. There was no further treatment, and he was just waiting for his time to go.
In our last meeting, I took my PhD degree to the hospital and asked him to sign it. Instead, he wrote a long note on the back of its cover. It remains very dear to me. That meeting happened in the Waterloo general hospital cafe, where Alex's two kids, his father, and Daniela were present.
I attended the memorial for Alex on Saturday, March 18th, just before the Spring and the Iranian new year. Music that Alex had selected was being played, and we were given a sheet of paper with a poem that he chose for this occasion. I realized that he had made many of the arrangements himself, as this sad occasion was expected.

For me, Alex was a fun friend and boss, who taught me a lot about research, and about life. He helped me a lot in all stages of our collaboration, from the moment that I met him for the first time, seeking a professor who would help me switch advisers, to the moment that I was negotiating my job offer with the University of Manitoba. I remember the time that I received an email from Dr. David Johnson, my PhD external committee member whom I wanted to work with as a postdoc, informing me that we probably could not work together since he had cancer. When I told Alex that David had cancer, Alex became very sad and shocked. That happened a bit before Alex himself was diagnosed with a much worse cancer. I remember the day that he was diagnosed, and how gloomy and shocked the whole Algorithms lab was. I left Waterloo for MIT a month after. Alex helped me a lot on this path, and later I realized that David had also recommended me for that position. I miss both of them. David passed away a few months earlier than Alex. I lost two great mentors in a few months. I cannot stop thinking about Alex. He will be dearly missed.

p.s. here and here are two pages in Alex's memory.

p.s. the first picture is from the day of my defense. From left to right: Ian, me, Alex, Jonathan Buss, David, and Jochen Koenemann. The second picture is from the day of my convocation with Alex.


                                                       

Saturday, March 25, 2017

A One-Day Course by Edward Tufte at CSAIL

I was fortunate to be supported by my adviser at MIT, Charles E. Leiserson, to attend a one-day course by Edward Tufte on March 17th, 2017. Here is some info about the course on Tufte's website. MIT CSAIL organized the event and made it possible for many students and postdocs to attend it. It took place in a big conference room at the Marriott Cambridge, which is quite close to the Stata Center. Because of the bad weather a few days earlier, two classes were combined and there was a big audience. As a result, there were two screens in the big room, and Edward seemed to have difficulty switching between them and focusing on one. I have learned from another workshop (see here) that using two screens for a talk is simply a bad idea, and it is now rare compared to 10 years ago.

At the beginning, we were given Tufte's books, which are centered on "how to present data and information". In general, the course was about selected topics from these books. I believe if someone has the books, there is little reason to attend the expensive class. We were asked to show up one hour before the course and were given instructions to read particular parts of the books 'carefully'. I thought this was necessary for understanding the material and following the course. Unfortunately, we later learned that it was just to illustrate Tufte's approach to teaching, where he presents students with some learning material before beginning a class. In the case of this course, that material was never referred to.

The course began with darkening the room, followed by a piano piece with an animated graphical score as a way to 'present the underlying data', something like this. The course continued with a review of web-page design, high-resolution screens, etc. During the course, the room lighting was adjusted multiple times. Sometimes we were in complete darkness, which made it hard to take notes on paper (and awkward to do so on a laptop, as the screen seemed too bright). We were barred from taking videos.

One of the things that I remember (and do not necessarily agree with) from the course is that Tufte objects to the idea of presenting little material in order to teach it effectively. He believes a lot of data can be presented, e.g., in the same figure, and the reader can perfectly digest it. I agree with this in many cases but not always. Sometimes the extra details just become confusing. Quite related to this, we learned that Tufte prefers high-density data displays that convey a lot of information. As a result, he does not like PowerPoint or slides since they tend to break data into small portions.

One interesting point that Tufte makes is that data explanation (e.g., what a color or bar means) should be close to the data itself (those colors or bars). You should not add an explanation in a corner of a figure. Just add it in the exact place where it is needed.

I also learned about sparklines, which were introduced by Tufte in the 1980s. Here is the wiki page about them. I find them useful in presenting high-density data, and I believe they are needed in, but absent from, many Computer Science research papers.
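
To show how little it takes to produce one, here is a minimal Python sketch that renders a text sparkline with Unicode block characters. It is my own illustration, not something from the course; the function name `sparkline` and the sample numbers are made up, and real sparklines are usually drawn as small line graphics rather than characters.

```python
# Eight block characters of increasing height.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight bar heights, scaled to the data range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range; draw it at the lowest height.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Truncate to an index and clamp so the maximum maps to the tallest bar.
    return "".join(
        BARS[min(top, int((v - lo) / span * top))] for v in values
    )


# Hypothetical running times of eight benchmark runs.
print(sparkline([12, 15, 11, 30, 28, 14, 13, 29]))
```

A word-sized strip like this can sit inline in a sentence or a table cell next to the number it summarizes.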

One nice thing that I remember from the course was a 19th-century diagram about the losses of the French Army in Napoleon's Russian invasion. Here is a wiki page about it. It is a good example of how data can be 'beautiful evidence' (the title of Tufte's book which includes this topic).

I find this review of Tufte's course quite interesting. I agree that Tufte rambles a lot and talks too much about his books and himself, e.g., when he talks about his experience with NASA, PowerPoint, or his personal dismissal of big data. He looked like an arrogant person to me, especially when I saw how he had a long dialogue with a colleague while there was a long line of people waiting for him to sign the books (I wanted to have his signature on the books, but I changed my mind after that).

In my experience, which is shared by a few others who wrote reviews about the course, Tufte's books are better than his class, and arguably better than his manner.