# Subset-AVG – Finding a subset of List Which Matches Known Rational Number

I’ve asked this on math overflow and used comments to clarify/overstate my question. I hope it has the intended effect and doesn’t come off as jarring.

I’m attempting to find what subset of numbers reach a known average.

I have a list of known values, negative and possible decimals. They look something like this {-.32,-.64,-.12,.08,-.54,-.43, …}

It’s around 50 numbers in some cases, though this problem would be tested for other cases as well.

The set mostly contains negative decimal numbers, while in rare cases, has a few positive decimals – it never has whole numbers.

I also have a known value, which I know to be the average of some subset of the above list.

The known value is similar to -.03.

I’m uncertain of the grouping mechanism used, but seemed to reach stack overflow trying to solve this problem when not grouping.

I’ve attempted a few ways of going about solving this problem. I’m using Python 3.6 and imported numpy as np.

I’m wondering if the “subset-avg” code I’ve adapted from another solution for subset-sum here (I’ll give due credit when i can find that question again) is not the most efficient way/if there’s any huge mistake in my even attempting to resolve this that I haven’t seen.

Thanks in advance for any thoughts.

```def subset_avg(numbers, target, partial=[],depth=1):
# create AVG function
# set average of partial
a = np.mean(partial)
# check if the partial sum is equals to target
if a != target:
print("Currently Testing the Following Subset's " " " + "Average(%s)  =  %snn" % (partial, round(a,2)))
print(depth)
if a == target or round(a,2) == target:
print('nn')
print("************")
print("************")
print('nn')
print("Found Subset AVG " + "Average(%s)  =  %s" % (partial, target))
print('nn')
print("************")
print("************")
print('nn')
print(depth)
# for each number in range of list
for i in range(len(numbers)):
# set n = current iteration in list
n = numbers[i]
# remaining values is current iteration + 1 through end of list
remaining = numbers[i+1:]
# calculate mean of partial, set partial = partial plus n
subset_avg(remaining, target, partial + [n],depth+1)
# Example of use
x = [-.32,-.64,-.12,.08,-.54,-.43]
subset_avg(x,-.03)
```

Here is a solution I adapted from a subSet sum algorithm I posted for another question (here). Since the algorithm loops through potential solution sizes, it was easy to adapt it to search for an average.

The `iSubSum()` function takes 3 parameters: The target average, the list of values and an optional rounding precision parameter. It is a generator so it will produce all possible solutions when used in a loop. You can also get the first solution quickly using the `next()`function. This should produce results much faster than a brute force approach, especially for large lists.

The function is based on a modified version of a subset-sum algorithm that returns solutions as lists of indices. This is intended to distinguish combinations that have duplicate values coming from different indices in the original list.

```from bisect import bisect_right
from itertools import accumulate
def iSubAverage(M,A,P=0):
smallSize     = 20
smallSums     = set()
def subSumForSize(S,A,size,failedSums=None):
nextSum = A[size-2][2] if size>1 else 0
index   = bisect_right([a for a,_,_ in A],S-nextSum) # max element for target
A       = A[:index]
if len(A) < size:    return                  # not enough elements for size
if A[size-1][2]  > S: return                 # minimum sum > target
maxSum = A[-1][2]
if len(A) > size: maxSum -= A[-size-1][2]
if maxSum < S:  return                       # maximum sum < target
if len(A) <= smallSize and S not in smallSums: return
if failedSums is None: failedSums = set()
while index >= size:
index -= 1
a,i,ca = A[index]
if size == 1:
if a == S: yield [i]
continue
c0 = A[index-size][2] if index>size else 0
if ca-c0 < S: break
subS = S-a
if subS in failedSums: continue # known unreachable sum
failed = True
for result in subSumForSize(subS,A[:index],size-1,failedSums):
yield result+[i]
failed = False
if not A: return
if M < 0: M,A = -M,[-a for a in A] # must have positive target
offset = max(0,-min(A)) # circumvent negatives (requires loop on sizes)
A      = sorted([(round(a+offset,P),i) for i,a in enumerate(A)])
cumA   = accumulate(a for a,i in A)
A      = [(a,i,ca) for (a,i),ca in zip(A,cumA)]
for a,_,_ in A[:smallSize]:
newSums = [a+s for s in smallSums] + [a]
smallSums.update(newSums)
for size in range(1,len(A)+1):
subS  = round(M*size,P)
if subS != round(M*size,P*2): continue # fractional numerator
subS += round(offset*size,P)
for result in subSumForSize(subS,A,size):
yield result
```

To get the actual values, the `iSubAvg()` function maps indices to the corresponding values in the list:

```def iSubAvg(M,A,P=0):
for iA in iSubAverage(M,A,P):
yield sorted([A[i] for i in iA])
L       = [-.32,-.64,-.12,.08,-.54,-.43]
targetL = -0.02
for solution in iSubAvg(targetL,L,2):
print(solution)
# [-0.12, 0.08]   (there isn't a solution for -0.03)
K = [0.72, 0.69, 0.81, -0.28, 0.6, 0.59, 0.77, 0.46, 0.36, 0.66, 0.88, 0.88, 0.9, -0.24, 0.5, -0.5, 0.46, 0.96, -0.22, -0.8, -0.13, 0.87, 0.78, 0.2]
targetK = -0.02
for solution in iSubAvg(targetK,K,2):
print(solution)
# [-0.5, 0.46]
# [-0.5, 0.46]
# [-0.8, -0.22, 0.96]
# [-0.5, -0.28, 0.72]
# [-0.28, -0.24, 0.46]
# [-0.28, -0.24, 0.46]
# [-0.5, -0.24, 0.2, 0.46]
# [-0.5, -0.24, 0.2, 0.46]
# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]
# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]
next(iSubAvg(0.165,K,2))
# [-0.8, -0.28, -0.24, 0.66, 0.69, 0.96]
```

note that the function returns all combinations including repetitions for duplicate values in the source list. You can filter out these duplicates if you don’t need them