In SYMPHONY, problem variables are represented by a unique global index assigned to each variable by the user. This index represents each variable's position in a “virtual” global list known only to the user. The main requirement of this indexing scheme is that, given an index and a list of active cuts, the user must be able to generate the corresponding column to be added to the matrix. As an example, in problems where the variables correspond to the edges of an underlying graph, the index could be derived from a lexicographic ordering of the edges (when viewed as ordered pairs of nodes).
This indexing scheme provides a very compact representation, as well as a simple and effective means of moving variables in and out of the active set. However, it means that the user must have a priori knowledge of all problem variables and a method for indexing them. For combinatorial models such as the Traveling Salesman Problem, this does not present a problem. However, for some set partitioning models, for instance, the number of columns may not be known in advance. Even if the number of columns is known in advance, a viable indexing scheme may not be evident. Eliminating the indexing requirement by allowing variables to have abstract, user-defined representations (such as we do for cuts), would allow for more generality, but would also sacrifice some efficiency. A hybrid scheme, allowing the user to have both indexed and algorithmic variables (variables with user-defined representations) is planned for a future version of SYMPHONY.
For efficiency, the problem variables can be divided into two sets, the base variables and the extra variables. The base variables are active in all subproblems, whereas the extra variables can be added and removed. There is no theoretical difference between base variables and extra variables; however, designating a well-chosen set of base variables can significantly increase efficiency. Because they can move in and out of the problem, maintaining extra variables requires additional bookkeeping and computation. If the user has reason to believe a priori that a variable is “good” or has a high probability of having a non-zero value in some optimal solution to the problem, then that variable should be designated as a base variable. It is up to the user to designate which variables should be active in the root subproblem. Typically, when column generation is used, only base variables are active. Otherwise, all variables must be active in the root node.