Drew Marsh posted an interesting article entitled "Avalon Dissected." It is great to see people taking an interest in Avalon at this level. However, he got some of the details around Visuals wrong.
The graphics system in Avalon is new and is built to take advantage of the hardware. Beyond this, there is a software engine for cases where the hardware isn't supported. On top of this, we've built a "scene graph" system that natively supports animation and remoting. In the normal case, the actual rendering of the scene graph is done on a dedicated composition thread where third-party code isn't allowed to run. The description of everything on screen, along with some primitive animation information, is passed to this thread so that rendering can happen independently of what is going on with the regular UI thread. When you are running an Avalon application from a remote machine (via Terminal Services or Remote Desktop) with an advanced client, this composition thread will run on the remote machine and the actual rendering of the scene graph will happen without any network traffic. Some animations will also be able to run on the remote machine without network traffic.
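To make the hand-off between the two threads a little more concrete, here is a minimal sketch in Python rather than Avalon code. Everything in it is an assumption made for illustration: the `composition_loop` function, the `inbox` queue, and the tuple-of-strings scene description are hypothetical and are not how Avalon represents or transports its scene graph. It only shows the general shape of the idea: the UI side posts a description of the scene and moves on, and a separate thread consumes those descriptions on its own schedule.

```python
import queue
import threading
from typing import List, Optional, Tuple

# Hypothetical scene description: just an ordered tuple of item names.
Scene = Tuple[str, ...]


def composition_loop(inbox: "queue.Queue[Optional[Scene]]", frames: List[Scene]) -> None:
    """Consume scene descriptions until a None sentinel arrives."""
    while True:
        scene = inbox.get()
        if scene is None:
            break
        # Stand-in for real drawing: record what would have been rendered.
        frames.append(scene)


inbox: "queue.Queue[Optional[Scene]]" = queue.Queue()
frames: List[Scene] = []

worker = threading.Thread(target=composition_loop, args=(inbox, frames))
worker.start()

# The "UI thread" only posts descriptions; it never draws anything itself.
inbox.put(("background", "button"))
inbox.put(("background", "button", "tooltip"))
inbox.put(None)  # ask the composition thread to stop
worker.join()
```

Because the worker runs only code supplied by the system in this sketch, a slow or blocked UI side cannot stall the loop that draws; it simply keeps presenting the last description it received.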
From the programming model point of view, a node in our scene graph is called a Visual. There are different types of visuals that are intended to be used for different purposes. It is also important to note that UIElement derives from Visual, so every UIElement is-a Visual. (At one point an Element had-a visual, but this was changed with the rearchitecture that ChrisAn talks about.)
Every Visual implements an interface called IVisual. There is no situation where a third party should be implementing IVisual. We created this interface so that we can use explicit interface implementation to hide the methods on IVisual. We didn't want to pollute the public object model for UIElement (and Button, etc.) with things that most users aren't going to use.
A Visual has the following capabilities:
For each visual, the content of that visual is rendered first, followed by its children from left to right. The Z-order of the children is completely implied by their order in the visual tree.
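The ordering rule can be sketched as a simple pre-order walk. This is Python rather than Avalon code, and the `Node` class and `paint_order` function are hypothetical names made up for this illustration. Anything later in the resulting list is painted on top of anything earlier, which is all that "Z-order implied by tree order" means here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a visual: a name plus ordered children."""
    name: str
    children: List["Node"] = field(default_factory=list)


def paint_order(node: Node) -> List[str]:
    """Return names in paint order: own content first, then children left to right."""
    order = [node.name]
    for child in node.children:
        order.extend(paint_order(child))
    return order


root = Node("root", [
    Node("a", [Node("a1"), Node("a2")]),
    Node("b"),
])

# Later entries paint over earlier ones.
order = paint_order(root)
```

In this sketch "b" ends up last, so it would draw over "a" and both of its children wherever they overlap, without any explicit Z-index being stored.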
Different specializations of Visual provide for specific types of content and for different flavors of public programming model. Here are the main ones:
When it comes time to render (scheduled via the Dispatcher class and coordinated with the animation system and layout), the system walks the tree and decides what is and isn't on screen. It then calls render on any RetainedVisuals as necessary and caches the result. It also updates the data held by the other thread for what is on screen, so that rendering can happen asynchronously from whatever is going on on the UI thread. In this way, there is no "render" happening on the UI thread in the normal case. Instead, we are "compiling" the scene graph down to a simpler representation that runs asynchronously on another thread or on another machine.
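As a rough illustration of that walk, here is a minimal sketch in Python rather than Avalon code. All of the names (`RetainedNode`, `on_render`, `compile_scene`, the string "draw commands") are hypothetical, and the sketch assumes that each node's bounds cover its whole subtree so that an off-screen node can be skipped along with its children. It shows three things from the description above: off-screen content is skipped, a node's render callback runs only when there is no cached result, and the output is a flat list that could be handed to another thread.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def intersects(a: Rect, b: Rect) -> bool:
    """True when two rectangles overlap by a non-empty area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


@dataclass
class RetainedNode:
    """Hypothetical retained node; bounds are assumed to cover the whole subtree."""
    name: str
    bounds: Rect
    on_render: Callable[[], List[str]]
    children: List["RetainedNode"] = field(default_factory=list)
    cached: Optional[List[str]] = None
    render_calls: int = 0

    def content(self) -> List[str]:
        # Call the render callback only when nothing is cached yet.
        if self.cached is None:
            self.render_calls += 1
            self.cached = self.on_render()
        return self.cached


def compile_scene(node: RetainedNode, viewport: Rect,
                  out: Optional[List[str]] = None) -> List[str]:
    """Flatten the visible part of the tree into an ordered list of commands."""
    if out is None:
        out = []
    if not intersects(node.bounds, viewport):
        return out  # off screen: skip this node and its subtree
    out.extend(node.content())
    for child in node.children:
        compile_scene(child, viewport, out)
    return out


viewport: Rect = (0, 0, 100, 100)
far = RetainedNode("far", (150, 0, 50, 50), lambda: ["draw far"])
root = RetainedNode("root", (0, 0, 200, 100), lambda: ["fill root"], [
    RetainedNode("left", (0, 0, 50, 50), lambda: ["draw left"]),
    far,
])

first = compile_scene(root, viewport)
second = compile_scene(root, viewport)  # served from the cache
```

Both passes produce the same flat list, but the render callbacks run only during the first one, and the node that lies outside the viewport is never asked to render at all.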
The bits that we shipped at the PDC don't have this system fully implemented. For example, the composition thread isn't running completely asynchronously, and rich Terminal Services support isn't in yet. Come beta, more of the system will be completed and working.
I'm sure that this is confusing and that I've missed some stuff, so feel free to ask questions and I'll do my best to answer.