Interview with Cyrus Mehta

by Stephen Lake

Cyrus Mehta is co-founder and President of Cytel Software Corporation and Adjunct Associate Professor of Biostatistics at the Harvard School of Public Health.

SL: This is the ten-year anniversary of Cytel. How was the company initially founded?

CM: We founded the company in 1987 and I was the only employee. We thought that my partner, Nitin Patel, who was then in India, would shortly be joining me. I have a two-family house in Cambridge and I rented the first floor to Cytel.

SL: What was the state of exact inference prior to Cytel?

CM: The way we became involved with exact inference was quite interesting. In the late 70's and early 80's no one taught exact inference or used it other than for Fisher's exact test on a single 2 x 2 contingency table. When I came to Harvard as a postdoctoral research fellow in 1979, department members were working on computational problems in areas such as survival analysis and time-dependent covariates. We formed a Computational Statistics Interest Group, which planned to meet once a week to discuss new ideas. Dr. Zelen offered to give the first talk. In his talk he wrote out a generalization of Fisher's exact test on the blackboard. He said, "Everyone knows what Fisher's exact test is for the 2 x 2 table. But what about Fisher's exact test for the 2 x c table?" He laid out all the probabilities and said, "Maybe this is an interesting problem for the Computational Statistics Group."

I became interested in this problem because my background was in Operations Research. My partner, Nitin Patel, and I had both received our Ph.D.'s in Operations Research from MIT. We saw a connection between this problem and the network dynamic programming-type algorithms in OR. We were able to apply those methods and solve the 2 x c problem. We later generalized it to r x c tables and then found that this was an area where several of the OR ideas could be applied. We continued working on the development of computational methods for exact inference and in 1987 I decided that I wanted to found the company.

The real support all through this whole activity was Dr. Zelen. He encouraged my partner and me from the beginning and helped us in numerous ways.
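[Editorial illustration, not part of the interview.] The stage-by-stage idea mentioned above can be sketched for a 2 x c table with fixed margins. This is a simplified illustration only, not the published network algorithm or any Cytel code. It assumes a linear statistic T = sum of scores[j] * x[j], where x[j] is the first-row count in column j; the function names and the choice of statistic are assumptions made for this example. Each "node" is the number of first-row units placed so far, and each "arc" places x units in the next column with weight C(c_j, x).

```python
from collections import defaultdict
from math import comb


def exact_trend_distribution(col_totals, row1_total, scores):
    """Exact null distribution of T = sum(scores[j] * x[j]) for a 2 x c table
    with fixed row and column totals (hypothetical helper for illustration)."""
    # layer maps: units placed in row 1 so far -> {statistic value: weighted table count}
    layer = {0: {0: 1}}
    remaining = sum(col_totals)
    for c_j, w_j in zip(col_totals, scores):
        remaining -= c_j  # total capacity of the columns still to come
        nxt = defaultdict(lambda: defaultdict(int))
        for k, dist in layer.items():
            # x must leave a row-1 total that later columns can still absorb
            lo = max(0, row1_total - k - remaining)
            hi = min(c_j, row1_total - k)
            for x in range(lo, hi + 1):
                weight = comb(c_j, x)  # ways to choose x of the c_j units
                for t, count in dist.items():
                    nxt[k + x][t + w_j * x] += count * weight
        layer = nxt
    total = comb(sum(col_totals), row1_total)  # number of equally likely tables
    final = layer[row1_total]
    return {t: count / total for t, count in sorted(final.items())}


def upper_tail_p(col_totals, row1, scores):
    """One-sided exact p-value P(T >= observed T) under fixed margins."""
    observed = sum(w * x for w, x in zip(scores, row1))
    dist = exact_trend_distribution(col_totals, sum(row1), scores)
    return sum(p for t, p in dist.items() if t >= observed)
```

With two columns and scores (0, 1) the statistic is just the second cell of the first row, so the sketch reduces to the one-sided Fisher's exact test on a 2 x 2 table. Merging partial paths that reach the same node with the same statistic value is what keeps the work far below enumerating every table.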

SL: What was the reason for forming Cytel?

CM: There were three reasons for forming Cytel. One was that this type of research could not have a major impact without software. It was just too complex. We couldn't expect people to read theoretical papers on network algorithms and then convert them into software. In fact, when we published one of our early papers on stratified 2 x 2 tables in JASA, the editor, Carl Morris, thought that our paper was a very good piece of research but was uncertain how it could benefit the practitioner. It was also unclear to him how researchers would actually be able to use it. We had no good answer other than that we had FORTRAN programs which we were able to distribute. We realized, though, that many people wanted these exact tests, so we felt that making commercial-grade software would be a viable venture and one that would further our research. The second reason was that starting the company appealed to me. I could be my own boss and have my own company. The third reason was that the federal government offered the Small Business Innovation Research Program. If you had an idea for a commercial product which involved innovative research, you could apply for funding through this program. In 1987 we received funding through this program.

SL: What are the products that Cytel currently offers?

CM: Our flagship product is StatXact, which does exact inference for nonparametric and categorical data. The next level of complexity is LogXact, which extends the permutation tests available in StatXact to discrete regression models. Currently it handles logistic regression, but we are extending it to include polytomous regression, Poisson regression and loglinear models.

We also have a different type of software called EaSt. EaSt stands for Early Stopping, and this software is for designing and performing interim monitoring on group sequential clinical trials.

We have recently acquired a package called EGRET which was developed at the University of Washington, Seattle. For a long time it was the standard package for epidemiologists. The product experienced trouble along the way because it couldn't be moved from DOS to Windows; when the platform changed it lagged behind. We were able to purchase the product and have been working on converting it to Windows. We will release the Windows version by the end of the year.

SL: Exact inference is fascinating in that an interdisciplinary approach is required in order to implement it. What are the principal stages involved in bringing an exact testing procedure from the drawing board to the professionals who will use it in practice?

CM: The research for the products which are in StatXact took place and was published about five years before the product came out. We start with the research and try to solve the technical problems in a small team. The team consists of myself, Nitin Patel, Pralay Senchaudhuri and, sometimes, doctoral students. Pralay is a former doctoral student of mine. Other doctoral students who have worked with us are Karim Hirji, Joan Hilton, Steve Walsh and, currently, Chris Corcoran. It has been a joy and a wonderful learning experience to have worked with these students. Once the research is done we submit it for publication. We don't have anything in StatXact which is not published in a quality journal. This gives it the seal of approval of the profession. We then develop commercial-grade code. The user interface is an enormous undertaking and the statistical part is a separate enormous undertaking. Finally, there are crucial decisions to be made concerning the data structures. In the testing phase we are fortunate to have many of our customers volunteer to test the product.

I have been very lucky in terms of collaborators. In fact, I've had the best in the world: my partner Nitin Patel, my colleague Pralay Senchaudhuri, who collaborates on the algorithms and converts them into experimental code, and my colleague Yogesh Gajjar, who handles the systems end and keeps abreast of the technology, which changes at a breathless pace.

I have been privileged to have had the opportunity to work with many brilliant faculty members at Harvard School of Public Health. For the EaSt software I was able to collaborate with Butch Tsiatis and Kyungmann Kim. In terms of the newer work we are doing on toxicology and carcinogenicity, I am very fortunate to have the opportunity to work with Louise Ryan and Paul Catalano. In the future I will be able to collaborate with L.J. Wei on his work in model checking and with Joseph Ibrahim on his work in missing data analysis.

SL: Where do you see Cytel when it comes time to celebrate the twentieth anniversary?

CM: I think we will stay very focused and will solve major technical problems such as we did in the first ten years. We solved the major technical problems in contingency tables and in the next ten years we'd like to solve the problems for discrete regression models.

I think that when the time comes to celebrate the twentieth anniversary our products will be available through numerous channels, not just through Cytel. Our goal is that these products will be integrated with all major statistical packages as well as with spreadsheet software like Excel. In addition, the software will, of course, be available directly from the internet. For us, connectivity and multi-platform compatibility are important since we develop highly specialized software for advanced types of statistical problems. That is why we worked hard during the first 10 years to be part of SAS and SPSS, where the basics are already provided. In the next ten years our goal is to be a household name with every producer and consumer of statistical products.

Cytel is having a 10th anniversary party at the Joint Statistical Meetings this August in Dallas. All friends of the Harvard Biostatistics Department are invited to join us.