Loongson Open Source Community

Original poster: joki

Moderator, can Loongson 3 be compatible with the SSE extension instruction set???

Posted on 2007-10-6 22:38:05
Originally posted by quene on 2007-10-6 10:15 PM
See MIPS Run, 2nd edition, last passage of Section 1.4.11, on the
Intrinsity FastMATH 2 GHz MIPS processor

http://www.en-genius.net/site/zones/networkZONE/product_reviews/netp_050602
Posted on 2007-10-6 22:40:03
Originally posted by 胶林探索 on 2007-10-6 10:38 PM

http://www.en-genius.net/site/zones/networkZONE/product_reviews/netp_050602

networkZONE Products for the week of May 6, 2002
Intrinsity Says . . .
Intrinsity's 2-GHz FastMATH Adaptive Signal Processor and FastMIPS CPU Provide ASIC-Level Computing with Programming Advantages of Embedded Microprocessors

Intrinsity has introduced its FastMATH microprocessor, a multi-GHz Adaptive Signal Processor chip that delivers unsurpassed programmable performance on real-time, adaptive signal processing applications. Intrinsity's FastMATH microprocessor combines an innovative MIPS-based architecture with 2 GHz speeds, delivering unprecedented real-time signal-processing performance in applications that would otherwise require banks of DSPs, expensive FPGAs or power-hungry desktop CPUs. Unlike those products or exotic alternatives, the FastMATH microprocessor is fully programmable, scalable and uses industry-standard development tools.

The price-performance and programming ease of Intrinsity's FastMATH microprocessor improve the economics and time-to-market schedules of many application areas, including medical imaging, military systems, network infrastructure and mobile data communications. Designed to scale beyond 4 GHz, the FastMATH processor delivers six times the performance of the fastest DSPs on common, math-intensive operations, such as the Fast Fourier Transform (FFT) algorithm. Adaptive signal processing algorithms are even more math-intensive and must quickly compute new system parameters based on multiple high-speed data streams.
"DSPs, FPGAs and desktop processors are clearly not meeting all the needs of today's high-performance adaptive signal processing applications. Intrinsity is utilizing its patented design technology, Fast14 Technology, to create very high performance processors that will help our customers accelerate the wide-spread adoption of these applications," stated Paul Nixon, CEO, president and co-founder of Intrinsity. "Our FastMATH Adaptive Signal Processor product can provide the computational speed of ASICs, while delivering the well-proven cost and time-to-market benefits of industry-standard RISC microprocessors."
Adaptive signal processing systems typically consist of arrays of changing data that require complex processing to deliver real-time performance in response to changing conditions. Broadly defined examples include picking signals out of background noise, canceling out interference from multiple sources and dynamically altering systems to respond to inputs from sensors or data streams. A large number of well-established adaptive signal processing algorithms already demand either higher computation rates or lower-cost implementations - or both.
2 GHz Matrix and Parallel Vector Math Unit - Provides exceptional parallel data computational performance on the commonly used matrix and vector math data types found in adaptive algorithms.
2 GHz MIPS32 Architecture-Based Processing Core - Provides the ease of programming and flexibility to address changing algorithms and standards.
High-Speed I/O - Allows complex, adaptive algorithms to be partitioned cost-effectively across multiple FastMATH processors by providing dual RapidIO ports.
For the first time, these elements are combined to form an Adaptive Signal Processor chip capable of delivering unprecedented programmable performance in real-time signal processing applications. This technology is especially valuable to designers of wireless systems as they innovate new ways to extend the capacity of cell towers in wireless systems: "As more users crowd the usable radio spectrum and demand higher data rates, more complex processing is required to make efficient use of this limited resource," stated Dr. Jim Gunn, Forward Concepts Senior consultant. "By accelerating adaptive signal processing algorithms with their multi-GHz FastMATH processor, Intrinsity offers technology that can enable wireless systems to make more efficient use of the limited bandwidth available for each cell site."
Intrinsity also announced the 2 GHz FastMIPS high-performance MIPS-based embedded processor. Delivered from a standard 0.13 µm foundry process and scalable to 4 GHz, the FastMIPS product benefits from Intrinsity's Fast14 technology to allow multi-GHz performance without exotic manufacturing techniques. Targeted at high-performance embedded applications, the FastMIPS product also includes dual RapidIO ports to enable balanced system performance.
Intrinsity is a MIPS architecture licensee, and the FastMATH and FastMIPS processors are based on the MIPS32 ISA. The selection of the MIPS architecture makes designs easy to implement, allowing customers to leverage best-of-breed design tools from suppliers such as Corelis, Green Hills Software, HelloSoft, OSE Systems and Wind River.
"We're pleased that Intrinsity has chosen the MIPS architecture as the basis for its adaptive signal processing strategy," said John Bourgoin, chairman and CEO of MIPS Technologies. "The replacement of hard-wired logic and special-purpose processors by high-performance, general-purpose processors has great benefit to SOC designers and is a trend we expect to see increase in the future."
Intrinsity will begin sampling its FastMATH and FastMIPS processors in 4Q 2002. Additional details will be disclosed at the Embedded Processor Forum that begins on April 29, 2002. Complete product specifications are available with the completion of a non-disclosure agreement.

EN-Genius Says . . .
Talk about cramming 10 pounds of expectations in a 1-pound can! Intrinsity certainly seems to have done this when they announced, a couple of weeks ago, a chip that combines a 2+ GHz embedded DSP and its companion RISC controller. They have ambitions to deliver a serious challenge to TI's and ADI's lock on high-performance signal processing in the wireless arena with products that offer better algorithmic support, as well as run a whole lot faster - in fact, many times faster than any existing signal processing chip.
Claims like those being made by Intrinsity would be hard to take seriously, except that they have working test chips (not the whole processor) that at least validate their "Fast14" design technology. The Fast14 transistor structure allowed a test chip made in standard 0.18-micron CMOS to run at 2.2 GHz. This gives me reason to at least hope they will deliver on their promise to have early silicon some time in Q4 '02.
To clarify a bit, their Fast14 technology consists of several elements. The first part is a proprietary transistor geometry and layout that they claim gives a 3-5X speed improvement over competing design methods. (Reality check: Analyst Linley Gwennap says it will deliver at least 2X.) Fast14 also employs dynamic logic to achieve these speeds, plus a new clocking style that uses 4 overlapping clock phases per cycle. They say that this avoids the race conditions that plague higher speed dynamic logic.
While dynamic logic has been an academic curiosity for several decades, it is much more difficult to design with than static logic. That is, of course, unless you develop your own special EDA design tools to account for its peculiarities, something which Intrinsity has done. Among other niceties, the tool features "noise-aware routing", which performs trace placement that anticipates where crosstalk problems are likely to occur and avoids them.
Tools like this allowed them to develop a new logic family that they call NDL (one-of-N dynamic logic). The logic functions, which operate on 2 bits of information at a time, require fewer gate delays for a given function than standard binary logic. In addition to enabling rip-snorting processing capacity and curing male-pattern baldness, Intrinsity also claims that their technology can be coaxed to consume less power than competing dynamic logic implementations from Intel.
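As a rough illustration of what "one of N" means when the logic handles 2 bits at a time, here is a minimal Python sketch. It assumes N = 4, i.e. each 2-bit digit is carried on four wires with exactly one wire asserted; the review itself does not spell out N, the function names are hypothetical, and this models only the encoding, not Intrinsity's actual circuits.

```python
def encode_1_of_4(value):
    """Encode a 2-bit value (0..3) as four wires with exactly one wire high."""
    if not 0 <= value <= 3:
        raise ValueError("value must fit in 2 bits")
    return tuple(1 if i == value else 0 for i in range(4))


def decode_1_of_4(wires):
    """Recover the 2-bit value; reject states that are not one-hot."""
    if len(wires) != 4 or sum(wires) != 1:
        raise ValueError("exactly one of the four wires must be high")
    return wires.index(1)


def to_1_of_4_digits(n, width_bits):
    """Split an integer into 2-bit digits (least significant first),
    each encoded on its own group of four wires."""
    return [encode_1_of_4((n >> shift) & 0b11)
            for shift in range(0, width_bits, 2)]


if __name__ == "__main__":
    # 0b1101 splits into the digits 01 (low) and 11 (high).
    for wires in to_1_of_4_digits(0b1101, 4):
        print(wires, "->", decode_1_of_4(wires))
```

Under this assumption a 32-bit word would need 16 such wire groups, and a gate sees a whole 2-bit digit per input group, which is one way to read the "fewer gate delays" claim.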
The first fruits of their efforts will be a pair of very fast processors sharing a single piece of silicon. Running at 2 GHz, the chip envisioned by Intrinsity will contain both a MIPS-like RISC CPU and a powerful matrix math unit. Both are based on the MIPS architecture, but sport different architectural modes to make them more suited to their particular tasks.
Their FastMIPS CPU is an extremely hopped-up version of the venerable MIPS processor. While it is quite capable of all the standard control and compute functions performed by its slower cousins, its primary job is to drive Intrinsity's FastMATH unit, a math engine that is optimized for matrix and vector calculations.
The FastMATH core programs a lot like a conventional external floating point unit (FPU) math co-processor used by MIPS machines in other applications. Data to be crunched is fed to the math engine via a 1-Meg L2 cache. An on-chip 2-port RapidIO interface connects the cache to the host, or to other FastMATH processors. An SDRAM controller and JTAG interface connect to the outside memory (up to a Gigabyte) to provide the math processor with a place to store variables and program data. Programming is downloaded from the FastMIPS via a MIPS standard coprocessor interface. Hooked in tandem, the awesome twosome runs TI benchmark FFTs 6X faster than TI's fastest 600 MHz 'C6x.
And, should you have really outrageous number crunching tasks to do, you can drive multiple FastMATH chips using a single FastMIPS processor.
Posted on 2007-10-6 22:42:43
When I asked what made Intrinsity different from the re-configurable processors offered by Chameleon and QuickSilver, they explained that it is fully programmable, rather than configurable. Although I'm somewhat algorithmically-challenged, their claim seems reasonable to me, and it would make it easier to implement new signal-processing schemes, especially ones that are adaptive in nature.

Today, adaptive algorithms are usually run on large arrays of DSPs in high-value signal-processing applications such as real-time MRI- and CAT-scan image analysis. Intrinsity is betting that if they could get the cost down, similar algorithms could be used to boost the capacity and quality of mobile/wireless communications. At least in theory, using adaptive processing to drive smart antennas and multi-user detectors can improve throughput by up to 3X.

These sorts of signal-processing tasks require fast matrix and vector operation capabilities to help them adapt quickly to fast-changing signals. Even Motorola's AltiVec, a very powerful and well-conceived hybrid processor, does not do these matrix operations. These capabilities could help wireless systems designers realize the long-promised "software radio" that can redirect its computing efforts to handle new or evolving standards in ways that ASIC- or FPGA-based systems won't.

Speaking of software, Intrinsity made a good decision in using the MIPS architecture as the basis for its product instead of something totally original. Basing their machine around a well-supported instruction set means you won't have to program this beastie with a handful of half-baked custom tools from the manufacturer. Instead, you'll have access to tons of software and software support from Wind River, OSE, Green Hills, MATLAB, HelloSoft, and Corelis.

If Intrinsity hits its ambitious 4Q '02 sampling date with working silicon, I expect that it will find a home in places like wireless base stations - especially multi-protocol systems with extended capacity technologies. Should 3G recover from its current stalled market conditions, this would be a good candidate chip for building base stations that could track the evolving standards and allow almost instant upgrades as new features, protocols, and maybe even modulation schemes arise.

With all the renewed interest in intelligence work, I also expect that it will find some military and "black" apps as well, in areas such as SIGINT and radar analysis.

As with anything that pushes several edges of the envelope at once, I don't consider it a sure bet that Intrinsity will be able to deliver on all its promises - even with the successful test silicon, and the six-month lead time from their announcement. Nevertheless, their validation of the technology with test chips and their use of the solid MIPS architecture keep my vapor index rating (just barely) within the credible range. I'm rooting for the chip though, and look forward to hearing about working samples some time before the snow flies this winter.
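Original poster | Posted on 2007-10-7 11:09:21
Originally posted by AFXIF on 2007-10-6 09:23 PM
The Loongson team very likely did once work on a VLIW processor that could binary-translate the x86 and Power instruction sets in firmware, at roughly the time Hong Kong's Culturecom acquired Transmeta's production line.
My guess is that back then they thought they could go through Hong Kong, where relations were fairly good, to obtain a license for Transmeta's instruction set, ...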


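Doesn't the Chinese Academy of Sciences have a DBT group dedicated to binary translation research? They have written so many mips to x86 papers and registered a pile of patents, so how could there be no technical reserve at all? It is about time that work was turned into products.
Isn't the Transmeta acquisition just a rumor? An interview with Li Guojie covered this (I can't remember exactly which article). Reportedly, when Loongson 2 was being designed, they considered a technical approach similar to Transmeta's. But they later learned from a Hong Kong company (probably Culturecom) that it did not perform well: "Some software runs acceptably, some runs very badly; this technology is not yet mature." So binary translation was set aside for Loongson 2. Besides, rather than buying Transmeta's binary translation technology, which died off because of poor performance, it would be more practical to buy its low-power technology! Can you succeed just by copying a failed, low-performance technology? Its x86 and extension instruction sets are emulated in software, so licensing does not really come into it. Unless you mean the whole package of the hardware VLIW instruction set plus the binary translation software, but can that be carried straight over to a multi-core MIPS processor? And it would have to be reworked to deliver very high performance; isn't that as hard as starting from scratch? How much juice can be squeezed out of sugarcane pulp that someone else has already chewed? I am puzzled...
In the end we have to rely on ourselves. I hope Loongson 3 achieves a breakthrough.
Also, this is the first I have heard of the 4-core strong-core design you mention. If it is true, it means the original plan was to make fairly large changes to the core architecture. Is it just that the plan has since shifted to a combined hardware/software approach, and so to simple multi-core plus software optimization?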

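[ Last edited by joki on 2007-10-7 11:25 AM ]
Original poster | Posted on 2007-10-7 11:21:50
Originally posted by 胶林探索 on 2007-10-6 09:31 PM

In terms of performance figures this should be faster than Loongson 2, right?


Maybe. There is always a higher mountain...
But without enough applications, what use is even a 2 GHz clock? Adoption is hard.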
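Posted on 2007-10-7 12:48:43
That firmware is exactly what was wanted. There is nothing wrong with picking up what others have already worked out; without accumulation there is no progress.
Also, the plan at the time was probably to use the experience accumulated on Loongson to design a new VLIW processor from scratch. That processor was not Loongson 3, and might not even have been called Loongson.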
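Posted on 2007-10-7 15:32:18
Originally posted by joki on 2007-10-7 11:09 AM


Doesn't the Chinese Academy of Sciences have a DBT group dedicated to binary translation research? They have written so many mips to x86 papers and registered a pile of patents, so how could there be no technical reserve at all? It is about time that work was turned into products.
Isn't the Transmeta acquisition just a rumor? An interview with Li Guojie covered ...

I heard from wwa a while back that that DBT is quite inefficient; maybe that is why it was not turned into a product.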
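Original poster | Posted on 2007-10-7 21:23:34
Originally posted by AFXIF on 2007-10-7 12:48 PM
That firmware is exactly what was wanted. There is nothing wrong with picking up what others have already worked out; without accumulation there is no progress.
Also, the plan at the time was probably to use the experience accumulated on Loongson to design a new VLIW processor from scratch. That processor was not Loongson 3, and might not even have been called Loongson.


Heh, maybe. But then it would no longer be a MIPS processor. Besides, wouldn't adding an extra piece of firmware make the processor design too complex?
If there are resources and energy to spare, starting afresh is not a bad option either; it is walking on two legs, after all.

[ Last edited by joki on 2007-10-7 09:34 PM ]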
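Original poster | Posted on 2007-10-7 21:30:17
Originally posted by FFFM on 2007-10-7 03:32 PM

I heard from wwa a while back that that DBT is quite inefficient; maybe that is why it was not turned into a product.



That one is pure-software dynamic translation, so it is inefficient.
Posted on 2007-10-8 00:12:47
Hard to pull off.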
