citeseer |
Date: 08-04-09
Abstract: The semiconductor industry is exploring various device and manufacturing techniques to continue scaling transistor sizes beyond the capabilities of CMOS. This scaling is desirable because it reduces device power consumption and area while allowing higher operating speeds. However, the scaled transistors are increasingly susceptible to manufacturing defects. Architectures built from these transistors will need to tolerate defect rates that are orders of magnitude higher than those found in current CMOS technologies. We previously demonstrated an approach that provides defect isolation in a network of a large number of simple self-assembled computational nodes. This scheme can handle up to 30% defective nodes, but requires these limited-size nodes to implement fail-stop behavior. In this paper, we explore trade-offs in implementing test mechanisms to achieve fail-stop behavior in nodes while meeting manufacturing constraints. We use hardware self-test mechanisms to verify critical node components, and software tests for noncritical components. We reuse test logic where possible and move noncritical verification to software to meet technological size constraints. The modularity of the node and test logic, together with the ability to disable defective components, enables the use of nodes with some defective noncritical components. This allows the system to tolerate higher transistor defect rates. In particular, if nodes with at least one communication unit and one compute unit, or with two communication units, are allowed to operate, we can tolerate a transistor defect probability of 1.5 × 10⁻⁴. This is an order of magnitude higher than the defect probability that can be tolerated when a single defective transistor results in an unusable node.
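The usability rule stated in the abstract (a node may operate with at least one communication unit and one compute unit, or with two communication units) can be illustrated with a small probability sketch. This is not the paper's model: it assumes that transistor defects are independent, that a unit works only if all of its transistors are defect-free, and that the self-test detects every defect. The unit counts and transistor counts below are hypothetical placeholders chosen only for illustration, and all function names are invented for this sketch.

```python
from math import comb


def unit_ok_prob(p, transistors):
    """Probability that a unit with `transistors` transistors has no defect,
    assuming independent defects with per-transistor probability p."""
    return (1.0 - p) ** transistors


def binom_pmf(n, k, q):
    """Probability that exactly k of n independent units work,
    when each unit works with probability q."""
    return comb(n, k) * q ** k * (1.0 - q) ** (n - k)


def node_usable_prob(p, t_crit, n_comm, t_comm, n_comp, t_comp):
    """Probability that a node is usable under the relaxed rule:
    critical logic is defect-free AND
    (>= 1 communication unit and >= 1 compute unit work, OR
     >= 2 communication units work)."""
    q_crit = unit_ok_prob(p, t_crit)
    q_comm = unit_ok_prob(p, t_comm)
    q_comp = unit_ok_prob(p, t_comp)

    p_comm_0 = binom_pmf(n_comm, 0, q_comm)
    p_comm_1 = binom_pmf(n_comm, 1, q_comm)
    p_comm_ge2 = 1.0 - p_comm_0 - p_comm_1
    p_comp_ge1 = 1.0 - binom_pmf(n_comp, 0, q_comp)

    # The two cases (>= 2 comm units) and (exactly 1 comm unit with
    # >= 1 compute unit) are disjoint, so their probabilities add.
    return q_crit * (p_comm_ge2 + p_comm_1 * p_comp_ge1)


def node_perfect_prob(p, t_crit, n_comm, t_comm, n_comp, t_comp):
    """Probability that a node is usable under the strict rule:
    a single defective transistor anywhere makes the node unusable."""
    total = t_crit + n_comm * t_comm + n_comp * t_comp
    return unit_ok_prob(p, total)


if __name__ == "__main__":
    # Hypothetical node layout (NOT taken from the paper).
    layout = dict(t_crit=500, n_comm=4, t_comm=300, n_comp=1, t_comp=1500)
    for p in (1e-5, 5e-5, 1e-4, 1.5e-4, 5e-4):
        relaxed = node_usable_prob(p, **layout)
        strict = node_perfect_prob(p, **layout)
        print(f"p={p:.1e}  relaxed={relaxed:.3f}  strict={strict:.3f}")
```

Comparing the two columns against the 30% defective-node limit from the abstract (that is, a usable fraction of at least 0.7) shows qualitatively why disabling noncritical defective units raises the tolerable transistor defect probability. The specific thresholds depend entirely on the assumed layout.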