## Ensure that contents of list sum up to 1 for np.random.choice()

The Context: In Python 3.5, I'm making a function to generate a map with different biomes - a 2-dimensional list with the first layer representing the lines of the Y-axis and the items representing items along the X-axis. Example: [ ["A1", "B1",...
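
One common way to meet the requirement in the title is to divide the weights by their total before passing them as `p`. This is a minimal sketch; the biome names and raw weights are hypothetical and do not come from the question.

```python
import numpy as np

# Hypothetical biome labels and unnormalized weights, for illustration only.
biomes = ["forest", "desert", "ocean"]
weights = np.array([3.0, 1.0, 2.0])

# Dividing by the total makes the probabilities sum to 1 (up to rounding).
p = weights / weights.sum()

np.random.seed(0)  # seeded only so the sketch is repeatable
row = np.random.choice(biomes, size=5, p=p)
```

Normalizing once, right before the call, avoids having to keep the weights summing to 1 by hand while they are being tuned.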

## How to modify a Numpy 2D array in every row but specific column indexes?

This is what I am doing now to achieve what I want. In: a=numpy.zeros((3,2)) a[range(a.shape[0]),[0,0,1]] = 1 a Out: array([[ 1., 0.], [ 1., 0.], [ 0., 1.]]) As you can see, I used the range function to select all the rows in a. Is t...

## Fast way to count occurrences of all values in a pandas DataFrame

Suppose I have the following data: import pandas as pd import numpy as np import random from string import ascii_uppercase random.seed(100) n = 1000000 # Create a bunch of factor data... throw some NaNs in there for good measure data = {letter: [...

## How to test whether one of two values present in numpy array or matrix column on single pass?

Take a matrix like the following: import numpy as np m = np.matrix([[1,1], [2,0], [3,1], [5,1], [5,0]]) Then take two test values: n1 = 4 n2 = 1 How can I test for both of them (it's g...
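
A possible single-call check uses `np.isin`, which compares every element against all candidate values at once. The sketch below assumes the first column is the one being tested and uses a plain `np.array` instead of `np.matrix`; both choices are assumptions, since the excerpt is cut off.

```python
import numpy as np

m = np.array([[1, 1], [2, 0], [3, 1], [5, 1], [5, 0]])
n1, n2 = 4, 1

# Assumption: the first column is the one of interest.
column = m[:, 0]

# np.isin returns a boolean mask; .any() reports whether either value occurs.
found = np.isin(column, [n1, n2]).any()
```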

## Hierarchical data: efficiently build a list of every descendant for each node

I have a two column data set depicting multiple child-parent relationships that form a large tree. I would like to use this to build an updated list of every descendant for each node. Original Input: child parent 1 2010 1000 7 2100 1...

## How to create Numpy array using fancy indexing

How do I use numpy's fancy indexing to create this? I would like the fastest performance: array([[ 1, 2, 3, 4, 16, 31], [ 2, 3, 4, 5, 17, 32], [ 3, 4, 5, 6, 18, 33], [ 4, 5, 6, 7, 19, 34], [ 5, 6, 7, 8, 20, 3...

## Compute x**k with x, k being arrays of arbitrary dimensionality

I have two numpy arrays: One array x with shape (n, a0, a1, ...) and one array k with shape (n, b0, b1, ...). I would like to compute an array of exponentials such that the output has dimension (a0, a1, ..., b0, b1, ...) and out[i0, i1, ..., j0,...

## How to resolve this equation in python?

I'm trying to resolve this matrix using python: a|b|c d|e|f g|h|i A+B+C=constant D+E+F=constant G+H+I=constant A+D+G=constant B+E+H=constant C+F+I=constant A+E+I=constant C+E+H=constant I have 9 variables + the constants These 9 variables are di...

## How to calculate error in Polynomial Linear Regression?

I am trying to calculate the error rate of the training data I'm using. I believe I'm calculating the error incorrectly. The formula is as shown: y is calculated as shown: I am calculating this in the function fitPoly(M) at line 49. I believe ...

## Selecting every n indices of an array

If I have an array that is 100 elements in length, what is the most Pythonic way to get every n indices? For example, if I wanted every 5 indices of an array a, how could I get an array b=[[0,1,2,3,4],[5,6,7,8,9],[10,11,12,13,14],...], where each el...
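
When the length is an exact multiple of n, the grouping described here can be sketched with a reshape. The array `a` below is filled with `np.arange(100)` as an assumption, so that values and indices coincide.

```python
import numpy as np

a = np.arange(100)  # assumed contents: values equal their indices
n = 5

# Each row of b holds n consecutive elements; -1 lets NumPy infer the row count.
# This requires len(a) to be divisible by n.
b = a.reshape(-1, n)
```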

## Numpy: An efficient way to merge multiple slices

I have a numpy array test_array = np.arange(100).reshape((4,25)) and I want to merge the following cols to form a new array 1:3, 2:4, 3:15, 2:24, 6:8, 12:13 I know this code will work np.hstack((test_array[:,1:3],test_array[:,2:4],test_array[:...

## Windows/Python Error WindowsError: [Error 3] The system cannot find the path specified

Hi, I am new to Python and I need some help. I am trying to run a file on Windows 10 OS with python 2.7. import os import re import codecs import numpy as np import theano models_path = "./models" eval_path = "./evaluation" eval_temp = os.path.join(e...

## In NumPy, how to explain the return of ndarray[True]?

In NumPy, I know that we can index an ndarray by boolean values in the following way in a Python interpreter: >>> import numpy as np >>> b = np.arange(1, 6) >>> print(b) [1 2 3 4 5] >>> bi = np.array([True, False,...

## How to use arrays/vectors in a Python user-defined function?

I'm building a function to calculate the Reliability of a given component/subsystem. For this, I wrote the following in a script: import math as m import numpy as np def Reliability (MTBF,time): failure_param = pow(MTBF,-1) R = m.exp(-failure_p...
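
A function like this usually accepts arrays once `math.exp` is replaced by `np.exp`, which works element-wise. The sketch below assumes the truncated expression is the exponential model exp(-time / MTBF); the name `reliability` and the sample inputs are illustrative, not taken from the question.

```python
import numpy as np

def reliability(mtbf, time):
    """Exponential reliability model, evaluated element-wise.

    Assumption: R = exp(-time / MTBF); the original expression is cut off.
    """
    failure_rate = 1.0 / np.asarray(mtbf, dtype=float)
    return np.exp(-failure_rate * np.asarray(time, dtype=float))

# Hypothetical inputs: one MTBF value, several mission times.
times = np.array([0.0, 100.0, 200.0])
r = reliability(100.0, times)
```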

## Python Pandas Data frame creation

I tried to create a data frame df using the below code : import numpy as np import pandas as pd index = [0,1,2,3,4,5] s = pd.Series([1,2,3,4,5,6],index= index) t = pd.Series([2,4,6,8,10,12],index= index) df = pd.DataFrame(s,columns = ["MUL1"]) df["M...

## How to conditionally combine two numpy arrays of the same shape

This sounds simple, and I think I'm overcomplicating this in my mind. I want to make an array whose elements are generated from two source arrays of the same shape, depending on which element in the source arrays is greater. To illustrate: import ...
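
Element-wise selection between two same-shaped arrays is typically done with `np.where`. The sketch below assumes the rule is "take the larger element"; the actual rule is cut off in the excerpt, and the sample arrays are made up.

```python
import numpy as np

# Hypothetical source arrays of the same shape.
a = np.array([[1, 8], [5, 2]])
b = np.array([[4, 3], [0, 9]])

# Where the condition holds take from a, otherwise from b.
combined = np.where(a > b, a, b)
```

For this particular rule `np.maximum(a, b)` gives the same result; `np.where` is the more general form, since the two value arguments can be any expressions that depend on the comparison.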

## How to select all non-NaN columns and non-NaN last column using pandas?

Forgive me if the title is a little bit confusing. Assume I have test.h5. Below is the result of reading this file using df.read_hdf('test.h5', 'testdata') 0 1 2 3 4 5 6 0 123 444 111 321 NaN NaN NaN 1 12 ...

## Reshaping and stacking a 2D array to form a 3D array

I have a dataframe as below df = pd.DataFrame({'a':[1,1,1,2,2,2], 'b': [10, 20, 30, 20, 40, 60], 'c': [80, 80, 80, 120, 120, 120]}) I want to get 3D array array([[[ 1, 10, 80], [ 2, 20, 120] ], ...

## Poor performance of C++ function in Cython

I have this C++ function, which I can call from Python with the code below. The performance is only half compared to running pure C++. Is there a way to get their performance at the same level? I compile both codes with -Ofast -march=native flags. I ...

## Define an algorithm which gets a number and a list and returns a scalar based on number's distance to average of the list

Let's suppose that we have a list to which an integer between 15 and 32 is appended in each iteration (let's call the integer rand). I want to design an algorithm which assigns a reward around 1 (between 1.25 and 0.75) to each rand. The rule for as...

## Fastest way to sum over rows of sparse matrix

I have a big csr_matrix (1M*1K) and I want to add over rows and obtain a new csr_matrix with the same number of columns but a reduced number of rows. Actually, my problem is exactly the same as this: Sum over rows in scipy.sparse.csr_matrix. The only thing is...

## How to create an increasing, multi-dimensional array in numpy

I am trying to perform inverse warping given a homography matrix, and to do that efficiently I need a numpy array that looks like this: ([[0, 0, 1], [0, 1, 1], [0, 2, 1], ... [1, 0, 1], [1, 1, 1], ... [n, p, 1]]) Where n is an image's width (im.sh...
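
An array of this shape can be sketched with `np.indices`, which produces the row and column counters without Python loops. The helper name `homogeneous_grid` is hypothetical, and the sketch assumes both counters run from 0 up to but not including their bounds; the excerpt does not settle whether the end points are inclusive.

```python
import numpy as np

def homogeneous_grid(n, p):
    """Return an (n*p, 3) array of rows [i, j, 1], for i < n and j < p."""
    ii, jj = np.indices((n, p))          # ii varies slowest, jj fastest
    ones = np.ones(n * p, dtype=ii.dtype)
    return np.stack([ii.ravel(), jj.ravel(), ones], axis=1)

# Small hypothetical sizes, chosen so the result is easy to inspect.
grid = homogeneous_grid(2, 3)
```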

## How to reshape a multi-column dataframe by index?

Following on from here. The solution works for only one column. How can it be improved to handle multiple columns? I.e., if I have a dataframe like df= pd.DataFrame([['a','b'],['b','c'],['c','z'],['d','b']],index=[0,0,1,1]) 0 1 0 a b 0 b c 1 ...

## How to check if data is missing after two or more repeating values in pandas and replace missing value with previous value?

I am trying to fill missing values with the previous value, but only if the previous value is repeating. Sample DF: Index Columns 0 1978.0 1 1918.0 2 1918.0 3 1918.0 4 NaN 5 NaN 6 NaN 7 1853.0 8 1831.0 9 NaN fo...

## Inconsistent python print output

(Python 2.7.12) - I have created an NxN array; when I print it I get exactly the following output: Sample a: SampleArray=np.random.randint(1,100, size=(5,5)) [[49 72 88 56 41] [30 73 6 43 53] [83 54 65 16 34] [25 17 73 10 46] ...

## ScikitLearn regression: Design matrix X too big for regression. What do I do?

I have a matrix X that has something like 7000 columns and 38000 rows. Thus it is a numpy array with shape (38000, 7000). I instantiated the model model = RidgeCV(alphas = (0.001,0.01, 0.1, 1)) and then fitted it model.fit(X, y) where y is the ...

## NumPy: Replace all elements along third dimension with averages in 3D array

I have a 3D array of the dimensions 400*800*3. I want to replace all the elements in the third dimension with the average of that dimension. Right now I have been able to accomplish this by using loops. test_data=np.random.randint(0,256,size=(400,80...
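
The loop can usually be replaced by taking the mean along the last axis and broadcasting it back to the original shape. This sketch uses a much smaller array than 400*800*3 and assumes a floating-point result is acceptable; both are assumptions made for illustration.

```python
import numpy as np

# Smaller stand-in for the 400*800*3 array described above.
test_data = np.random.randint(0, 256, size=(4, 8, 3))

# keepdims=True leaves a length-1 third axis, so the means broadcast cleanly.
means = test_data.mean(axis=2, keepdims=True)           # shape (4, 8, 1)

# Copy the broadcast view to get a normal, writable array of the full shape.
averaged = np.broadcast_to(means, test_data.shape).copy()
```

If integer output is needed, the result could be cast back with `astype`, at the cost of truncating the fractional part.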

## Python - Reading and Writing Structured Binary Files

Currently I'm trying to manipulate a kind of binary file whose structure is like this: FileHeader + [SectionHeader + Binary(1D-array)] * NumSetions After searching on the internet, I came up with the following code to read it: import numpy as np...

## Efficient pairwise computation of identical elements in a large 2D numpy array

I have a 2D numpy array with hundreds of thousands of rows and a thousand or so columns (let's say it's an N x P array with N = 200,000, P = 1000). The goal here is to compute the number of identical elements between each pair of row vectors, ideally ...

## Create mask by first positions only

I have an array: a = np.array([[ 0, 1, 2, 0, 0, 0], [ 0, 4, 1, 35, 0, 10], [ 0, 0, 5, 4, 0, 4], [ 1, 2, 5, 4, 0, 4]]) I need to select only the first consecutive 0s in each row: [[ True Fal...

## A more efficient method of passing multiple conditions for each element in a 3D numpy array

I have a 3-dimensional numpy array that I am checking multiple conditions for. I am checking each element to see if it is less than a certain number. If each 3d element is indexed by i, where i=[0,1,2] in what I call array3, and if one of the el...

## Why is the dtype shown (even if it's the native one) when using floor division with NumPy?

Normally the dtype is hidden when it's equivalent to the native type: >>> import numpy as np >>> np.arange(5) array([0, 1, 2, 3, 4]) >>> np.arange(5).dtype dtype('int32') >>> np.arange(5) + 3 array([3, 4, 5, 6, 7...

## Python 3: RuntimeWarning with numpy.power

When using numpy.power(2,N), where N is an integer, I encounter the following issue: In[1] np.power(2,63) Out[1] -9223372036854775808 RuntimeWarning: invalid value encountered in power and even more strangely, In[2] np.power(2,63)*2 Out[2] 0 Th...
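
The value 2**63 is one more than the largest signed 64-bit integer, so a result stored as a signed 64-bit integer cannot hold it and wraps around. A few ways to avoid the wrap-around are sketched below; which one fits depends on whether exact integers are needed, and the variable names are illustrative.

```python
import numpy as np

# Plain Python integers have arbitrary precision, so this is exact.
exact = 2 ** 63

# An unsigned 64-bit integer can hold values up to 2**64 - 1.
as_uint = np.power(np.uint64(2), np.uint64(63))

# A float result avoids overflow but is only exact for powers of two like this one.
as_float = np.power(2.0, 63)
```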

## Extracting and transforming data in numpy

Suppose I have the following numpy vector [[1, 3., 'John Doe', 'male', 'doc', '25'], ..., [9, 6., 'Jane Doe', 'female', 'p', '28']] I need to extract the data relevant to my task. Being a novice in numpy and python in general, I would do it in t...