Hardware Selection for Large-Scale Clusters

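Cloudera is a commercial Hadoop support provider and a cloud services provider. Below is an article from the Cloudera blog whose main purpose is to give us guidance on choosing hardware when building large-scale clusters.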

Original article: http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/

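First, we need to understand our own application, or more precisely the machines playing different roles in the cluster, and where their performance bottlenecks lie. The characteristics to consider are disk capacity, CPU, network, and memory capacity.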

Below are the machine configurations mentioned in the article, each suited to a different need:

  • Light Processing Configuration (1U/machine): Two quad core CPUs, 8GB memory, and 4 disk drives (1TB or 2TB). Note that CPU-intensive work such as natural language processing involves loading large models into RAM before processing data and should be configured with 2GB RAM/core instead of 1GB RAM/core.
  • Balanced Compute Configuration (1U/machine): Two quad core CPUs, 16 to 24GB memory, and 4 disk drives (1TB or 2TB) directly attached using the motherboard controller. These are often available as twins with two motherboards and 8 drives in a single 2U cabinet.
  • Storage Heavy Configuration (2U/machine): Two quad core CPUs, 16 to 24GB memory, and 12 disk drives (1TB or 2TB). The power consumption for this type of machine starts at around 200W in idle state and can go as high as about 350W when active.
  • Compute Intensive Configuration (2U/machine): Two quad core CPUs, 48-72GB memory, and 8 disk drives (1TB or 2TB). These are often used when a combination of large in-memory models and heavy reference data caching is required.
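
To compare these configurations side by side, the figures above can be turned into a small table and the two derived numbers that matter most, RAM per core and raw disk capacity per node, computed from it. The following is only a rough sketch: the `NodeConfig` class and its field names are made up for illustration, "two quad core CPUs" is counted as 8 cores, and raw capacity ignores replication and filesystem overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical record for one of the configurations listed above."""
    name: str
    rack_units: int
    cores: int          # two quad-core CPUs are counted as 8 cores
    memory_gb: tuple    # (min, max) memory in GB
    disks: int
    disk_tb: tuple      # (min, max) size of a single drive in TB

    def ram_per_core_gb(self):
        """Return (min, max) GB of RAM available per core."""
        low, high = self.memory_gb
        return (low / self.cores, high / self.cores)

    def raw_storage_tb(self):
        """Return (min, max) raw disk capacity per node in TB."""
        low, high = self.disk_tb
        return (self.disks * low, self.disks * high)


# Numbers copied from the list above; nothing here is measured.
CONFIGS = [
    NodeConfig("Light Processing", 1, 8, (8, 8), 4, (1, 2)),
    NodeConfig("Balanced Compute", 1, 8, (16, 24), 4, (1, 2)),
    NodeConfig("Storage Heavy", 2, 8, (16, 24), 12, (1, 2)),
    NodeConfig("Compute Intensive", 2, 8, (48, 72), 8, (1, 2)),
]

for cfg in CONFIGS:
    print(cfg.name, cfg.ram_per_core_gb(), cfg.raw_storage_tb())
```

Seen this way, the Light Processing configuration works out to 1GB RAM/core, which is why the note on that bullet suggests moving to 2GB RAM/core (16GB on an 8-core machine) for model-heavy workloads such as natural language processing.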
