Abstract: Production grids are commonly observed to be inherently unreliable. This work aims to improve grid application performance by tuning the job submission system. A stochastic model capturing the behavior of a complex grid workload management system is proposed. To instantiate the model, detailed statistics are extracted from dense grid activity traces. The model is then used to optimize a simple job resubmission strategy. It provides quantitative inputs for improving job submission performance and makes it possible to quantify the impact of faults and outliers on grid operations.
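
To make the idea of tuning a resubmission strategy concrete, here is a minimal sketch. It is not the model proposed in this work. It assumes a hypothetical strategy in which a job is cancelled and resubmitted once a fixed timeout expires, that attempts are independent, and that a known fraction of submissions (outliers) never complete. The names `expected_latency`, `best_timeout`, and `outlier_ratio` are illustrative, and the latency sample is invented for the example. Under these assumptions the expected total latency is E[min(R, T)] / P(R <= T), where R is the latency of one attempt and T is the timeout.

```python
import math

def expected_latency(latencies, outlier_ratio, timeout):
    """Expected total latency under a cancel-and-resubmit-at-timeout strategy.

    latencies: observed latencies of jobs that did complete (illustrative sample).
    outlier_ratio: assumed fraction of submissions that never complete.
    timeout: delay after which a pending job is cancelled and resubmitted.
    """
    n = len(latencies)
    # Time spent in one attempt: min(R, timeout) for completing jobs,
    # and the full timeout for outliers.
    mean_capped = sum(min(x, timeout) for x in latencies) / n
    per_attempt = (1 - outlier_ratio) * mean_capped + outlier_ratio * timeout
    # Probability that one attempt finishes before the timeout.
    p_success = (1 - outlier_ratio) * sum(1 for x in latencies if x <= timeout) / n
    if p_success == 0:
        return math.inf  # no attempt can ever succeed with this timeout
    # Attempts are assumed independent, so E = per_attempt / p_success.
    return per_attempt / p_success

def best_timeout(latencies, outlier_ratio):
    """Pick the observed latency that minimizes the expected total latency."""
    # With an empirical sample, the minimum is reached at a sample point.
    candidates = sorted(set(latencies))
    return min(candidates, key=lambda t: expected_latency(latencies, outlier_ratio, t))

if __name__ == "__main__":
    sample = [100, 100, 100, 1000]  # seconds, made-up values
    print(best_timeout(sample, outlier_ratio=0.2))
```

With a sample like this one, where most jobs finish quickly and a few take much longer, the sketch favors a short timeout: abandoning slow attempts early costs less on average than waiting for them.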