Anteru's blog

Intel Larrabee instruction set is ready-to-use

March 25, 2009
  • Hardware

Intel just released a C++ header that lets developers use Larrabee instructions with current compilers by implementing the future intrinsics as plain C code. Some interesting bits:

  • 512-bit vector types (8x 64-bit double, 16x 32-bit float, or 16x 32-bit integer)
  • Lots of 3-operand instructions, like a=a*b+c
  • Most combinations of +,-,*,/ are provided, that is, you can choose between a*b-c, c-a*b and so forth.
  • Some instructions have built-in constants: 1-a*b
  • Many instructions take a predicate mask, which lets you operate selectively on just a part of the 16-wide vector primitive types
  • 32-bit integer multiplications which return the lower 32 bits of the result
  • ~~Lots of~~ trigonometric functions [Update] They don’t say which ones map directly to instructions, and provide them only for the sake of completeness.
  • Bit helper functions (scan for bit, etc.)
  • Explicit cache control functions (load a line, evict a line – that would have been helpful on a project I worked on once)
  • Horizontal reduce functions: Add all elements inside a vector, multiply all, get the minimum, and logical reduction (or, and, etc.).
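
To make the list above a bit more concrete, here is a minimal scalar sketch in C++ of three of the ideas: a 16-wide float vector, a multiply-add under a predicate mask, and horizontal reduces. All names (`Vec16f`, `Mask16`, `lane_bit`, `masked_madd`, `reduce_add`, `reduce_min`) are hypothetical and chosen for illustration only; they are not the names from Intel's header. The behaviour of masked-off lanes (they keep their old value) is an assumption as well.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration, not the names from Intel's header.
using Vec16f = std::array<float, 16>;  // 16 x 32-bit float = 512 bits
using Mask16 = std::uint16_t;          // one predicate bit per lane

// Bit that selects lane i in a Mask16.
inline Mask16 lane_bit(int i) {
    assert(i >= 0 && i < 16);
    return static_cast<Mask16>(1u << i);
}

// a = a * b + c on the lanes selected by mask; the result is stored in the
// first operand. Assumption: lanes whose mask bit is 0 keep their old value.
inline void masked_madd(Vec16f& a, const Vec16f& b, const Vec16f& c, Mask16 mask) {
    for (int i = 0; i < 16; ++i) {
        if (mask & lane_bit(i)) {
            a[i] = a[i] * b[i] + c[i];
        }
    }
}

// Horizontal reduce: add all 16 lanes into one scalar.
inline float reduce_add(const Vec16f& v) {
    float sum = 0.0f;
    for (float x : v) {
        sum += x;
    }
    return sum;
}

// Horizontal reduce: smallest of the 16 lanes.
inline float reduce_min(const Vec16f& v) {
    return *std::min_element(v.begin(), v.end());
}
```

For example, filling `a` with 2, `b` with 3 and `c` with 1 and calling `masked_madd(a, b, c, 0x00FF)` updates only the lower eight lanes; the upper eight are left alone.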

The reduce functions look especially interesting, as they are more general than the dot-product instruction available in SSE. There is nothing revolutionary here, but all in all it looks like a very nice and useful instruction set. I was hoping for 8-bit instructions as well, though: with 8-bit components and RGBA, you could process 4x4 pixels at once, which would be a real killer for image processing.
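
As a rough sketch of why the reduce functions are more general: a dot product is just an element-wise multiply followed by a horizontal add, and the same building blocks work for any length up to 16, whereas the SSE4.1 dot product handles at most four floats. The 4x4 pixel figure is simple arithmetic: 512 / 8 = 64 byte-sized components, which is 16 RGBA pixels. The names below (`Vec16f`, `reduce_add`, `dot_n`, and the constants) are hypothetical scalar stand-ins, not Intel's intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 512-bit vector of 16 floats.
using Vec16f = std::array<float, 16>;

// Horizontal reduce: add all 16 lanes into one scalar.
inline float reduce_add(const Vec16f& v) {
    float sum = 0.0f;
    for (float x : v) {
        sum += x;
    }
    return sum;
}

// Dot product over the first n lanes: multiply element-wise, then reduce.
// Lanes at index n and above stay zero, as if they were masked off.
inline float dot_n(const Vec16f& a, const Vec16f& b, std::size_t n) {
    assert(n <= 16);
    Vec16f prod{};
    for (std::size_t i = 0; i < n; ++i) {
        prod[i] = a[i] * b[i];
    }
    return reduce_add(prod);
}

// Arithmetic behind the "4x4 pixels at once" remark.
constexpr int kVectorBits = 512;
constexpr int kByteComponents = kVectorBits / 8;   // 64 8-bit components
constexpr int kRgbaPixels = kByteComponents / 4;   // 4 components per pixel
static_assert(kRgbaPixels == 4 * 4, "one vector would hold a 4x4 RGBA block");
```

Swapping the final `reduce_add` for a minimum or a logical reduction gives other patterns that a fixed dot-product instruction cannot express.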

Update: The instructions take three operands only, and the result is stored in the first operand!

Contents © 2005-2025
Anteru
Last updated February 03, 2019