
I think the article said that you don't need to use Hadoop for everything and that it might be much faster to just use command line tools on a single computer. Of course you might find a use case where the total computing time is massive and in that case a cluster is better. I still don't think many use cases have that problem.

We do some simple statistics at work on much smaller data sizes, and the computing time is usually around 10-100 ms, so we could probably process small batches at almost network speed.



Definitely. I was reacting to my parent poster, because size does not say everything. 1 TB can be small and 1 GB can be big: it depends on how much computation time whatever processing you do on the data actually needs.
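
To put rough numbers on that, here is a back-of-envelope sketch. The record sizes and per-record costs are made up for illustration (they are not from the article or from any real workload), and the helper `total_compute_seconds` is just a hypothetical name for "records times cost per record" on a single core.

```python
def total_compute_seconds(data_bytes: int, record_bytes: int, seconds_per_record: float) -> float:
    """Estimate single-core processing time: number of records times cost per record."""
    records = data_bytes / record_bytes
    return records * seconds_per_record


TB = 10**12
GB = 10**9

# Hypothetical workload A: 1 TB of 100-byte lines, 1 microsecond of work each
# (think of a simple count or filter).
cheap = total_compute_seconds(1 * TB, 100, 1e-6)

# Hypothetical workload B: 1 GB of 1 kB records, 1 second of work each
# (think of a heavy per-record computation).
expensive = total_compute_seconds(1 * GB, 1000, 1.0)

print(f"1 TB, cheap per-record work:     {cheap / 3600:.1f} hours")
print(f"1 GB, expensive per-record work: {expensive / 86400:.1f} days")
```

Under these assumed numbers the 1 TB job is a few hours on one core, while the 1 GB job is more than a week, so the smaller dataset is the one that would actually benefit from a cluster.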



