This post was migrated from an older system. Some links may point to content that no longer exists. Comments were not migrated.
How many of you have started using Sandboxed Solutions? What's the first thing you do in order to make them work? Start the Sandboxed Code Service.
For the most part, once the service is started, most folks don't think twice about the configuration. In many ways that's a good thing - it shows the service is easy to activate and use.
What about optimizing the service?
Tweaking the configuration is rarely done. Why? Quite honestly, it's probably due to the following:
a. Little information on what can be changed and how it should be changed
b. No clear guidance on how to collect metrics to validate those changes
Today we'll dive into the details of the service so that we can understand what can be changed and why you may want to consider changing the default values. Validation of those changes (i.e. metrics gathering) will be saved for another post.
Let's look at the basic building blocks of the Sandboxed Code Service. There are primarily 3 constructs that you should be aware of…
- Sandboxed Code Service can be run on any member of the SharePoint farm.
- The Sandboxed Code Service uses Tiers to define where code is run. By default, SharePoint creates one tier where all code is run.
- A Sandbox Tier determines how many resources (# of worker processes, connections, appdomains) are provided for code execution.
Let's dive into the details of each of these elements...
The first and easiest piece of the puzzle is that the Sandboxed Code Service is, in fact, a service. This effectively means the service can run on any member of the SharePoint farm. Starting and stopping the service is fairly straightforward.
The key point to take away here is isolation. The server that runs the sandboxed code can be different from the web server that is actually serving the page request.
The service will load balance requests for sandbox execution amongst all the servers that are running the service.
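As a sketch of how you might manage this from the SharePoint 2010 Management Shell (the `TypeName` filter below is an assumption - verify the exact display name on your own farm with `Get-SPServiceInstance`):

```powershell
# Load the SharePoint snap-in if running from a plain PowerShell console
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Show which farm servers have the Sandboxed Code Service, and its status on each
Get-SPServiceInstance |
    Where-Object { $_.TypeName -like "*Sandboxed Code Service*" } |
    Select-Object Server, TypeName, Status

# Start the service on every server where it is currently stopped
Get-SPServiceInstance |
    Where-Object { ($_.TypeName -like "*Sandboxed Code Service*") -and ($_.Status -eq "Disabled") } |
    Start-SPServiceInstance
```

Because the service is load balanced across all servers running it, starting it on additional farm members is how you add sandbox capacity.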
Once the service is up and running, the Sandboxed Code Service relies on Tiers to determine where code is executed.
Think of Sandbox Tiers as a resource pool. A Tier defines the following characteristics:
• Maximum worker processes
• Maximum application domains
• Maximum connections per process
• Maximum resource value
By default, SharePoint creates 1 tier which will serve all sandboxed execution requests (per the maximum resource value). The default tier also defines 1 worker process and 1 connection per process.
Once a load balanced request is sent to server N, the Sandboxed Code Service on server N will then determine which Tier will serve the execution request. The tier is chosen based on the maximum resource value (reference: the ResourceMaxValue property). Please note, the current MSDN text indicates the max value should be less than or equal to 1, but that is not the case.
As mentioned in the preceding section, Tiers can be conceptually thought of as resource pools. Each Tier allows administrators to control how sandboxed solutions are executed.
In short, adjusting the resource values of a Tier is an exercise in risk management. What do you provide to "good" solutions versus "bad" solutions? How much are you willing to lose if something forces the sandbox worker processes to crash?
Before we get too deep into the why/when/where of Tiers, let me throw out an important fact - the only way to manage the Sandbox configuration is via PowerShell. The Central Admin UI only allows you to start and stop the service.
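As a starting point, here is a minimal sketch for inspecting the current tier configuration. It assumes the farm-wide `SPUserCodeService` object (documented in the `Microsoft.SharePoint.Administration` namespace) and dumps all tier properties rather than guessing at individual names:

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Get the farm-wide Sandboxed Code (User Code) Service object
$ucs = [Microsoft.SharePoint.Administration.SPUserCodeService]::Local

# Dump every property of every tier. The worker process, app domain,
# and connection limits live here, along with ResourceMaxValue,
# which drives tier selection.
$ucs.Tiers | Format-List *
```

Running this against the default configuration should show the single out-of-the-box tier described earlier.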
Why are Tiers important? In other words, what is wrong with 1 tier?
Technically, there is nothing wrong with using only 1 Tier. As I mentioned earlier, the default configuration sets up 1 tier that is used by all user solutions.
However, if you really want to scale up your sandboxed operations, creating and managing multiple Tiers is a requirement. Tiers were specifically designed to address "bad code" scenarios. It will be your job, as an administrator, to ensure "bad code" has the least impact on your users and system.
Let's illustrate the risk management problem by looking at a few different examples…
Scenario 1: You simply bump up the value of MaximumConnectionsPerProcess from the default value of 1, but leave everything else as is. What happens if the worker process crashes and all available connections are being used when the crash occurs? Well, whatever those users were doing is now completely non-operational and/or has been left in a bad state (some work was done before the crash).
Scenario 2: You have code that historically takes a long time to complete and is highly likely to time out. The Sandbox will time out the process, but what happens if the Maximum Worker Processes and Maximum Connections per Process values are low? The user experience is probably going to be sluggish, especially under a high volume of user requests. Not only will existing users have to wait for the existing processes to complete, but new users may run into a scenario where there is no opportunity to execute because all processes and connections are taken.
Understanding how a Tier's ResourceMaxValue comes into play is crucial. Once a server has received a request to run code in the sandbox (i.e. the load balancer has already decided who is running the code), the target server must then determine which tier is actually responsible for executing the code. SharePoint keeps a tally of the average number of resource points that each user solution consumes. Tiers effectively allow you to balance "good" solutions into what is considered "high density" configurations (tiers with lots of worker processes, connections, app domains), while "bad" solutions can be served by "low density" configurations (tiers with fewer worker processes, connections, app domains).
For example, let's say that you have 3 Tiers called A, B, C. Those tiers have ResourceMaxValue set to .25, .50, and Int32.MaxValue, respectively. If a solution with an average resource value of .10 points per request is queued, it will run in Tier A. A solution with .40 points per request will run in Tier B. Any solution with more than .50 points per request will be served by Tier C. Heavy operations drive up your average resource consumption; in other words, smaller values represent solutions that have the least impact. In this case, we'd configure Tier A to have the highest density, Tier C the lowest density, and Tier B somewhere in between.
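The selection rule in that example can be sketched as plain PowerShell. This is a conceptual model of the behavior, not SharePoint's actual implementation: pick the tier with the smallest ResourceMaxValue that can still accommodate the solution's average cost.

```powershell
# Hypothetical tiers matching the A/B/C example above
$tiers = @(
    @{ Name = "A"; ResourceMaxValue = 0.25 },
    @{ Name = "B"; ResourceMaxValue = 0.50 },
    @{ Name = "C"; ResourceMaxValue = [Int32]::MaxValue }
)

function Select-Tier([double]$avgResourcePoints) {
    $tiers |
        Sort-Object { $_.ResourceMaxValue } |                      # smallest ceiling first
        Where-Object { $avgResourcePoints -le $_.ResourceMaxValue } |
        Select-Object -First 1
}

(Select-Tier 0.10).Name   # A
(Select-Tier 0.40).Name   # B
(Select-Tier 0.75).Name   # C
```

The same "first tier that fits" logic explains why a catch-all tier with a very large ResourceMaxValue is useful: it guarantees even the most expensive solutions have somewhere to run.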
Where should you start? The current line of thought is to increase the worker process count to at least the number of CPU cores plus 1, especially on servers that are dedicated sandbox servers. From there, the task of defining how many tiers you need and how each is individually configured is much more involved.
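A minimal sketch of that "cores + 1" starting point, applied to every tier. The `MaximumWorkerProcesses` property name is my reading of the tier object's documentation - confirm it with `$ucs.Tiers | Get-Member` before relying on it:

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ucs = [Microsoft.SharePoint.Administration.SPUserCodeService]::Local

# NUMBER_OF_PROCESSORS is an environment string, so cast before arithmetic
$workers = [int]$env:NUMBER_OF_PROCESSORS + 1

# MaximumWorkerProcesses is assumed from the documented tier properties;
# verify the name on your farm before running this
$ucs.Tiers | ForEach-Object { $_.MaximumWorkerProcesses = $workers }

# Persist the change to the configuration database
$ucs.Update()
```

As always with farm-wide configuration changes, try this in a test farm first and watch memory consumption, since each worker process carries its own overhead.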
In an upcoming post, I'll try to tackle how best to monitor solutions and the Sandboxed Code Service to help you effectively setup your Tiers.